Preface


It was somewhat of a surprise to the scientific community when, in 1944, Oswald 
Avery definitively proved that DNA encoded the blueprint to life. Many scientists 
at the time thought that, with just four bases, DNA was chemically too simple to 
contain so much information. Nearly 75 years later, though, we are still trying to 
parse all the information contained in a genome. This work has been greatly 
accelerated in the past decade by two parallel advancements: next-generation 
DNA sequencing technology and genome editing methods. Current sequencing 
capacity is leading to the generation of large amounts of genetic data, while our 
ability to manipulate the genome is rapidly advancing our understanding of that 
genetic data. 

Genome editing based on the microbial CRISPR-Cas adaptive immune system 
has emerged in recent years as a powerful tool for dissecting genetic circuits. 
CRISPR-associated enzymes such as Cas9 and Cpf1 are RNA-guided DNA
endonucleases that can be precisely targeted to nearly any region of the genome via the
guide RNA sequence. These enzymes have been used for both gene disruption and 
insertion in a wide range of organisms, and they have also been developed as a 
platform for gene activation, providing another way to modulate gene expression 
patterns. Finally, RNA-guided nucleases can facilitate both loss- and gain-of- 
function genome-wide screening applications. This technology has significantly 
advanced our ability to perform forward genetics in mammalian systems, model 
human diseases in tractable systems, and interrogate complex genetic processes. 
Moreover, it has the potential to revolutionize the way we treat human disease. 

The Fondation IPSEN Colloque Médecine et Recherche in the Neuroscience 
Series, held in Paris on April 22, 2016, highlighted how genome editing is enabling 
breakthroughs in how we study the brain and how we may be able to apply this 
powerful method to understand and treat central nervous system (CNS) disorders. 
The use of CRISPR-Cas-based technologies was a common thread that ran
throughout the meeting: it was used to develop new cell lines relevant to studying the
CNS and made it possible to use new model organisms to study the CNS; it
powered large-scale interrogation of neuronal genetic circuits; and it was used for
proof-of-principle therapeutic correction of disease-causing mutations.

In contrast to Avery’s discovery, nobody has ever doubted the complexity of the 
human brain. Neuroscientists have struggled for decades with seemingly intractable 
questions about the nature of the brain, and CNS disorders have proven to be some 
of the most difficult human diseases to study, in large part because the tools simply 
were not available. Genome editing, along with other recent technological advances
such as next-generation sequencing and optogenetics, is unlocking hundreds
of new ways to study the brain. The work that is described in this volume
exemplifies the lines of research that can now be pursued and offers a tantalizing 
glimpse of where this work will lead us. 


MA, USA Feng Zhang 
June 2017 


In Vitro Modeling of Complex Neurological
Diseases 


Frank Soldner and Rudolf Jaenisch 


Abstract A major reason for the lack of effective therapeutics and a deep biolog- 
ical understanding of complex diseases, which are thought to result from a complex 
interaction between genetic and environmental risk factors, is the paucity of 
relevant experimental models. This review describes a novel experimental 
approach that allows the study of the functional effects of disease-associated risk 
in complex disease by combining genome wide association studies (GWAS) and 
genome-scale epigenetic data to prioritize disease-associated risk variants with 
efficient gene editing technologies in human pluripotent stem cells (hPSCs). As a 
proof of principle, we recently used such a genetically precisely controlled exper- 
imental system to identify a common Parkinson’s disease-associated risk variant in 
a non-coding distal enhancer element that alters the binding of transcription factors 
and regulates the expression of α-synuclein (SNCA), a key gene implicated in the
pathogenesis of Parkinson’s disease. 


Introduction 


One of the main challenges to understanding the onset and progression of human 
disease is to develop effective model systems that combine known genetic elements 
with disease-associated phenotypic readouts. The identification of genes linked to 
familial forms of diseases such as cystic fibrosis, sickle cell anemia or monogenetic 
forms of neurodegenerative disorders has fundamentally changed our understand- 
ing of many diseases and provided vital clues into the underlying pathogenesis 
(Botstein and Risch 2003; Altshuler et al. 2008; McClellan and King 2010). 
Detailed knowledge of disease-causing mutations and genes allows the establish-
ment of reliable and disease-relevant cellular and animal models and facilitates the 
systematic analysis of molecular and cellular disease mechanisms and the devel- 
opment and validation of novel and effective therapeutic approaches. 

In contrast to such predominantly rare and monogenic disorders, the majority of 
the most common medical conditions, such as obesity, heart disease, diabetes, 
autoimmune disease or sporadic neurodegenerative disease, have no well-defined 
genetic etiology and do not follow Mendelian inheritance patterns. Population 
genetics suggest that such sporadic or polygenic diseases result from a complex 
interaction between multiple genetic and non-genetic, lifestyle and environmental 
risk factors (Botstein and Risch 2003; Altshuler et al. 2008). The complexity and 
our limited knowledge of the underlying genetic component have largely prevented 
the generation of genetically defined disease models. The paucity of disease- 
relevant experimental systems represents one of the major reasons for our limited 
biological understanding of complex diseases and an almost complete lack of 
disease-modifying effective therapeutics. 

In the following, we will summarize recent progress in genetics and develop- 
mental and molecular biology, which may provide a solution for generating disease- 
relevant in vitro models for complex disease. By combining human pluripotent stem 
cell (hPSC)-technology with genome editing and genome-scale epigenetic and 
genome-wide association studies (GWAS) data to identify disease-associated risk 
variants, we will provide a blueprint to create genetically defined experimental 
model systems that allow the functional analysis of disease-associated risk variants. 
As a proof of principle, we describe how we applied this approach to sporadic 
Parkinson’s disease and identified a common risk variant in a non-coding distal 
enhancer element that regulates the expression of SNCA, a key gene implicated in 
the pathogenesis of Parkinson’s disease (Soldner et al. 2016). 


Induced Pluripotent Stem Cells to Model Complex Diseases 


The ability to reprogram somatic cells into human induced pluripotent stem cells 
(hiPSCs) has opened the intriguing possibility of studying complex human disease in a 
cell culture dish (Takahashi and Yamanaka 2006; Takahashi et al. 2007; Yu et al. 
2007). Following in vitro differentiation, patient-derived hiPSCs provide access to 
large amounts of human disease-relevant cells that carry all the genetic alterations 
involved in disease development (Saha and Jaenisch 2009; Soldner and Jaenisch 2012; 
Takahashi and Yamanaka 2013; Yu et al. 2013). Even without precise knowledge of the
underlying genetics, such patient-derived cells therefore allow the generation of
relevant cellular model systems based on disease-associated genetic elements. This
approach has already been used to model a range of primarily monogenetic diseases, 
including neurodegenerative diseases such as Alzheimer’s disease, Parkinson’s dis- 
ease and amyotrophic lateral sclerosis (ALS; Cooper et al. 2012; Israel et al. 2012; 
Reinhardt et al. 2013; Alami et al. 2014; Wainger et al. 2014; Young et al. 2015). 
Despite the unprecedented potential and excitement of this approach, it became 
apparent that individual hiPSC lines, independent of disease status or genotype,
displayed highly variable biological properties in vitro, such as the propensity to 
differentiate into functional cell types (Bock et al. 2011; Boulting et al. 2011; Soldner 
and Jaenisch 2012; Nishizawa et al. 2016). This observation significantly limits their 
value to identify robust disease-associated phenotypes by simply comparing patient- 
derived cells with unrelated controls. This inherent variability has proven to
be particularly challenging in the context of age-related diseases, including
neurodegenerative diseases such as Alzheimer’s and Parkinson’s disease, considering that disease-
associated phenotypes typically progress slowly over many years in patients, which 
suggests that expected in vitro phenotypes would be rather mild and subtle. The 
reasons for the observed cell-to-cell differences include genetic background varia- 
tions, genetic and epigenetic changes resulting from reprogramming and extended 
maintenance of hiPSCs and the lack of robust in vitro differentiation protocols 
(Soldner and Jaenisch 2012; Liang and Zhang 2013). 

Some of the above-described limitations have been overcome by improved 
reprogramming and culture conditions (Warren et al. 2010; Hou et al. 2013), 
directed differentiation approaches including transcription factor-induced 
reprogramming (Zhang et al. 2013), insertion of cell type-specific fluorescent 
marker proteins to monitor differentiation (Di Giorgio et al. 2008; Hockemeyer 
et al. 2009, 2011; Chambers et al. 2012; Mica et al. 2013) or by consortium-size 
experiments to significantly increase the number of independent experimental 
samples (The HD iPSC Consortium 2012). However, variable genetic backgrounds 
between patient-derived and control cells remain an unresolved major limitation of 
the current hiPSC approach, due to the well-established influence of 
uncharacterized genetic modifiers on disease development and progression in 
patients and, accordingly, on disease-associated phenotypes in vitro. 


Gene Editing to Generate Genetically Controlled Disease 
Models 


The recent progress in gene editing technologies by using engineered nucleases such 
as meganucleases, zinc finger nucleases (ZFNs), transcription activator-like effector-
based nucleases (TALENs) and the CRISPR/Cas9 system is thought to provide an
elegant solution to control for differences in genetic background (Soldner et al. 2011; 
Soldner and Jaenisch 2012; Hockemeyer and Jaenisch 2016). In particular, the 
simplicity and ease of the CRISPR/Cas9 system to efficiently modify the genome 
in human cells, even at multiple loci simultaneously, allow us to engineer genetically 
controlled hPSC lines that differ only at known genetic disease-causing variants 
(Jinek et al. 2012, 2013; Cong et al. 2013; Mali et al. 2013). 

As a proof of principle, we recently used ZFNs to either seamlessly correct 
Parkinson’s disease-associated mutations in the SNCA gene in patient-derived 
hiPSCs or to insert similar variants into wild-type human embryonic stem cells 
(hESCs; Soldner et al. 2011). Such isogenic pairs of hPSC lines provided an 
experimental system with a controlled genetic background in which the
engineered disease-associated risk variants were the only experimental variables. 
Analyzing disease-associated phenotypes in this genetically controlled system 
allowed identification of nitrosative stress, accumulation of endoplasmic reticu- 
lum (ER)-associated degradation substrates, and ER stress as early Parkinson’s 
disease-associated pathological phenotypes (Chung et al. 2013). A further study 
revealed that nitrosative and oxidative stress result in S-nitrosylation of the
transcription factor MEF2C and inhibition of the MEF2C-PGC1α transcriptional
network contributing to mitochondrial dysfunction and apoptotic neuronal cell 
death (Ryan et al. 2013). By combining this monogenic disease model with 
disease-associated environmental stressors, the experiments further provide new 
mechanistic insight into gene-environment (GxE) interactions in the pathogenesis
of Parkinson’s disease (Ryan et al. 2013). Notably, both studies relying on a
genetically controlled in vitro model identified novel therapeutic targets and small
molecules that reversed the observed pathological phenotypes in neurons, which
are currently pursued as novel therapeutics for Parkinson’s disease (Chung et al.
2013; Ryan et al. 2013). The above-described approach clearly overcomes many 
of the limitations of the current hiPSC technology. Due to the simplicity of the 
CRISPR/Cas9 system to efficiently edit the genome in hiPSCs, the use of isogenic 
cell lines is becoming the gold standard for analyzing disease-associated pheno- 
types in vitro (Reinhardt et al. 2013; Kiskinis et al. 2014; Paquet et al. 2016). 
However, such an approach seems currently limited to monogenetic diseases in 
which the disease-causing genetic alterations are well established and the 
expected disease-associated phenotypes display robust and highly penetrant 
effects. 


Functional Role of GWAS-Identified Risk Variants 
in Complex Disease 


Translating the concept of engineering genetically controlled model systems to 
complex disease seems daunting and will require a detailed understanding of the 
underlying genetic component. GWAS and genome-scale next generation sequenc- 
ing (NGS) approaches have significantly advanced our understanding of the genetic 
basis of complex disease. GWAS in particular have identified numerous common 
single-nucleotide polymorphisms (SNPs) associated with human traits and dis- 
eases, pinpointing the genomic loci and genes thought to play important roles in 
the pathophysiology of the respective diseases (Botstein and Risch 2003; Altshuler 
et al. 2008; McClellan and King 2010). 

However, the interpretation of this steadily growing body of data is
limited by the fact that disease-associated SNPs only statistically correlate with the 
underlying disease and the vast majority of risk variants have no established 
biological relevance to disease or clinical utility for prognosis or treatment 
(Altshuler et al. 2008; McClellan and King 2010). Any SNP in linkage 
disequilibrium (LD) with a GWAS-identified risk variant is equally likely to be
causal for the risk of developing a specific disease. It has therefore been difficult to
distinguish variants that are functional and disease-relevant from those that are in 
LD and thus only mark the underlying haplotype containing the functional variant. 
Advancing from genetic association to causal biologic processes has been chal- 
lenging for two additional reasons. First, the majority of disease-associated genetic 
variants fall into the non-coding part of the genome, which impedes any functional 
analysis through simple transgenic overexpression or disruption in established cell 
lines or any analysis in non-human model systems due to the limited conservation 
of non-coding elements between species. Second, the prevailing hypothesis about 
the heritability of complex diseases suggests that multiple common or potentially 
rare SNPs cooperatively contribute to the risk of developing a specific disease; 
however, each individual risk variant will have only a small or at most medium-size 
additive or multiplicative effect on disease phenotypes (Gibson 2012). Indeed, 
disease-associated genetic variants are also prevalent in the healthy population, 
although with lower frequency, and the majority of carriers of risk SNPs do not 
develop a disease, implying that individual risk variants are not sufficient to cause 
disease-associated phenotypes. Consequently, only very few risk variants have 
been functionally linked to specific diseases, such as a common polymorphism at 
the 1p13 locus, which alters the expression of the SORT1 gene and is correlated
with both plasma low-density lipoprotein cholesterol (LDL-C) and myocardial 
infarction (Musunuru et al. 2010). 
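
The ambiguity that LD creates can be made concrete with the standard r² statistic between two biallelic SNPs. The sketch below is purely illustrative: the haplotype counts are invented, and the function name `ld_r2` is an assumption for this example rather than something taken from the cited studies.

```python
def ld_r2(n_AB, n_Ab, n_aB, n_ab):
    """Return r^2 between two biallelic SNPs from phased haplotype counts.

    n_AB is the number of haplotypes carrying allele A at SNP 1 and
    allele B at SNP 2, and so on for the other three combinations.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at SNP 2
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("a SNP is monomorphic; r^2 is undefined")
    d = p_AB - p_A * p_B          # coefficient of linkage disequilibrium D
    return d * d / denom

# Hypothetical counts: the two SNPs are always inherited together.
perfect = ld_r2(60, 0, 0, 40)
# Hypothetical counts: the two SNPs are statistically independent.
independent = ld_r2(25, 25, 25, 25)
```

When r² is close to 1, as in the first invented case, association statistics alone cannot distinguish the two SNPs, which is why functional evidence is needed to single out the causal variant.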

Under the assumption that specific risk haplotypes contribute through 
dysregulation of the same molecular pathways to disease risk, a current approach 
suggests that we stratify patient-derived hiPSCs according to specific genetic risk 
variants rather than according to disease status. This approach may be sufficient in 
some cases to reduce the genetic heterogeneity based on known disease haplotypes 
and to reveal previously masked disease-associated phenotypes. Indeed, this 
approach was successfully used to dissect the function of a common Alzheimer’s 
disease-associated non-coding genetic variant in the 5’ region of the SORL1 gene
(sortilin-related receptor 1; Young et al. 2015). However, the main limitation of
this approach remains the uncontrolled effect of additional genetic modifiers and the 
inability to identify the specific causative sequence variant that is required for further 
functional analysis. 


Epigenomic Signatures to Prioritize GWAS-Identified Risk 
Variants 


Cis-acting effects of genetic variants on gene expression have been proposed to 
be a major factor for phenotypic variation of complex traits and disease suscep- 
tibility (Schadt et al. 2003; Morley et al. 2004; Cheung et al. 2005, 2010; Lee and 
Young 2013; GTEx Consortium 2015). The widespread availability of cell- and 
tissue-specific transcriptome-wide expression data along with the corresponding
genotyping data has greatly facilitated the identification of expression quantitative
trait loci (eQTLs; GTEx Consortium 2015). Although able to detect statistical
correlation between specific risk variants and gene expression, this approach 
entails limitations that are comparable to traditional GWAS in identifying the 
functional risk variants. Recent genome-scale epigenetic studies such as the 
ENCODE (ENCODE Project Consortium 2012) and Roadmap Epigenomics 
project (Roadmap Epigenomics Consortium 2015) have allowed us to reliably 
identify and catalogue regulatory elements in a cell type-, tissue- and in some 
cases disease-specific manner. These studies specifically have highlighted the 
enrichment of GWAS-identified risk variants in regulatory DNA elements specific 
to tissues and cell types (Ernst et al. 2011; Degner et al. 2012; Maurano et al. 
2012; Hnisz et al. 2013; Trynka et al. 2013; Farh et al. 2014; Pasquali et al. 2014; 
Ripke et al. 2014) affected by the respective diseases. These results suggest that 
disease-associated risk variants may affect gene regulation by modifying the 
function of tissue-specific regulatory elements. In particular, distal enhancer 
elements that are bound by key transcription factors (TFs) and known to precisely 
control spatial and temporal gene expression during embryonic development and 
tissue homeostasis in a cell type-specific manner (Ward and Kellis 2012; Lee and 
Young 2013; Farh et al. 2014; Ripke et al. 2014; Wamstad et al. 2014) are found 
to be enriched for GWAS variants in many complex diseases. 

A number of recent studies have correlated changes in TF binding in enhancer 
regions with sequence-specific, heritable changes in chromatin state and gene 
regulation (Kasowski et al. 2013; Kilpinen et al. 2013; McVicker et al. 2013), 
thus providing a molecular mechanism for how individual sequence variants 
contribute to the development of complex diseases. Recent progress in defining 
TF binding specificities using high throughput SELEX and chromatin immuno- 
precipitation sequencing (ChIP-seq) approaches has largely increased our under- 
standing of sequence-specific TF binding in the genome and significantly 
improved our ability to analyze or predict TF binding on a genome-wide scale 
(Jolma et al. 2013, 2015). Based on the rapidly increasing availability of epige- 
netic data, mapping of GWAS-identified variants to TF binding sites within 
tissue-specific enhancer elements has been proposed as a valuable approach to 
prioritize and identify functional and disease-relevant risk variants (Ward and 
Kellis 2012; Rivera and Ren 2013; Claussnitzer et al. 2014; Wamstad et al. 
2014). Indeed, such integration of GWAS with epigenetic signatures for heart- 
specific enhancers allowed for the identification of novel functional risk variants 
for cardiac phenotypes (Wang et al. 2016). Likewise, a similar approach identi- 
fied an obesity-associated risk variant in the FTO locus, which alters early 
adipose differentiation by disrupting a TF binding site at a pre-adipocyte-specific 
enhancer (Claussnitzer et al. 2015). 
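
The idea of prioritizing a variant by its predicted effect on TF binding can be sketched as follows. This is a toy illustration, not the method of any study cited above: the position weight matrix, the flanking sequence and the helper names are all invented for the example, and only one DNA strand is scanned for brevity.

```python
import math

BASES = "ACGT"

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log-odds scoring matrix."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({b: math.log2(((column[b] + pseudocount) / total) / background)
                    for b in BASES})
    return pwm

def best_site_score(pwm, sequence):
    """Return the best log-odds score over all windows of the sequence."""
    width = len(pwm)
    scores = [sum(pwm[i][sequence[start + i]] for i in range(width))
              for start in range(len(sequence) - width + 1)]
    return max(scores)

def allele_delta(pwm, left_flank, right_flank, ref, alt):
    """Score difference (alt minus ref) for a SNP placed between two flanks."""
    ref_score = best_site_score(pwm, left_flank + ref + right_flank)
    alt_score = best_site_score(pwm, left_flank + alt + right_flank)
    return alt_score - ref_score

# Hypothetical 4-bp motif that strongly prefers "GATA" (invented counts).
toy_counts = [
    {"A": 0, "C": 0, "G": 20, "T": 0},
    {"A": 20, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 20},
    {"A": 20, "C": 0, "G": 0, "T": 0},
]
toy_pwm = log_odds_pwm(toy_counts)

# Hypothetical SNP: the alt allele "G" completes a GATA site, "C" does not.
delta = allele_delta(toy_pwm, "TT", "ATACC", ref="C", alt="G")
```

A positive `delta` means the alternative allele is predicted to bind the toy factor more strongly than the reference allele. A real analysis would scan both strands and use experimentally derived binding specificities.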

The 3-dimensional (3D) organization of the genome is thought to contribute to 
the regulation of gene expression (Bickmore 2013; de Graaf and van Steensel 
2013; de Laat and Duboule 2013). The recent development of chromosome 
conformation capture techniques (“3C” and genome-wide 3C-based methods; 
Dekker et al. 2002, 2013) or cohesin chromatin interaction analysis by paired-end
tag sequencing (ChIA-PET; Dowen et al. 2014) allows us to determine long-range
chromatin interactions such as cell type-specific promoter-enhancer interactions.
These analyses suggest that active enhancer elements are bound by transcription 
factors and loop over long distances to contact target genes to regulate transcription.
An emerging model suggests that promoter-enhancer interactions typically occur only
within megabase-sized topologically associating domains (TADs; Dixon et al.
2012; Nora et al. 2012), as defined by high DNA interaction frequency based on
genome-wide chromosome conformation capture data, or within insulated neighborhoods
inside such TADs that are bounded by cohesin-associated CTCF-CTCF loops (Handoko et al.
2011; DeMare et al. 2013; Dowen et al. 2014; Rao et al. 2014; Ji et al. 2016). 
Notably, there is mounting evidence that changes in 3D structure, potentially 
through sequence-specific disruption of CTCF interaction, might contribute to 
disease development (Ji et al. 2016). Integrating datasets of cell type-specific 
changes in enhancer-promoter interactions and information about the 3D structure 
of the genome will further help us to assign disease-associated risk variants in 
enhancer sequences to target genes and provide supporting evidence to identify 
functional disease-associated risk variants and deregulated target genes. 


Functional Analysis of Parkinson’s Disease-Associated Risk 
Variants 


As a proof of principle, we describe below how we recently applied the approach
outlined above to sporadic Parkinson’s disease as a prototypical complex
disorder, to identify common risk variants in non-coding distal enhancer elements 
that functionally modulate the risk to develop the disease (Soldner et al. 2016). 
Parkinson’s disease is the second most common chronic progressive neurodegen- 
erative disease, with a prevalence of more than 1% in the population over the age 
of 60. Although the discovery of genes linked to rare Mendelian forms of PD 
such as SNCA, LRRK2, PARKIN, PINK1 and DJ1 has provided insight into the
molecular and cellular pathogenesis of the disease (Gasser et al. 2011; Singleton 
et al. 2013), the etiology leading to neuronal cell loss is largely unknown. 
Importantly, over 90% of Parkinson’s cases do not show Mendelian inheritance 
patterns; however, substantial clustering of cases within families suggests that 
sporadic, late age of onset Parkinson’s disease results from a complex interaction 
between genetic risk alleles and environmental factors. A recent GWAS meta- 
analysis has identified 26 genomic loci containing risk variants for sporadic 
Parkinson’s disease (Nalls et al. 2014); however, as for the majority of neurode- 
generative disorders, little mechanistic insight is available on how specific 
sequence variations contribute to disease development and progression. 


Identification of Parkinson’s Disease-Associated Risk
Variants in Brain-Specific Enhancer Elements 


A recent analysis of Histone H3 acetylated at lysine 27 (H3K27ac)-marked regions 
in the post-mortem adult brain suggests a significant enrichment of Parkinson’s 
disease-associated risk SNPs within distal enhancer elements (Vermunt et al. 2014). 
This finding supports the hypothesis that sequence-specific changes in enhancer 
function and deregulated transcription of linked genes mediate the risk to develop 
the disease. A number of specific epigenetic features, such as p300 binding,
monomethylation of histone H3 at lysine 4 (H3K4me1), H3K27ac and DNase I
hypersensitive sites (DHSs), have been established as surrogate marks to reliably identify
candidate enhancer sequences (Visel et al. 2009, 2013; Creyghton et al. 2010; 
Rada-Iglesias et al. 2011; Maurano et al. 2012). Thus, to identify specific candidate 
risk variants in distal enhancers, we intersected Parkinson’s disease-associated risk 
SNPs (Nalls et al. 2014) with publicly available epigenetic data (Roadmap 
Epigenomics Consortium 2015). This analysis allowed us to compile a list of risk 
variants ranked by their overlap with active enhancer elements. Interestingly, many of
the top-ranked risk variants were located in the SNCA locus. Because changes in TF
binding are thought to be the major mediator of SNP-specific changes in gene
expression (Kasowski et al. 2013; Kilpinen et al. 2013; McVicker et al. 2013), we
further prioritized the risk variants in enhancers by comparing the TF binding
predicted from known TF binding specificities for the two alternative genotypes
of each Parkinson’s disease-associated SNP. This analysis
highlighted the Parkinson’s disease-associated SNP rs356168 in an enhancer in
intron 4 of SNCA as the risk variant with the highest number of genotype-dependent
differences in predicted TF binding in the SNCA locus. The functional relevance of this
enhancer was further supported by chromosome conformation capture data, 
which indicate a physical interaction (looping) between the enhancer and the 
promoter region of SNCA that is thought to be necessary for the cis-acting effects 
on gene expression (Vermunt et al. 2014). 
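
The intersection step described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the coordinates, the SNP identifiers and the mark tracks are invented, half-open intervals are assumed, and the function name `rank_snps_by_marks` is hypothetical. It does not reproduce the actual ranking used in Soldner et al. (2016).

```python
from collections import defaultdict

def rank_snps_by_marks(snps, mark_tracks):
    """Rank SNPs by how many enhancer-associated tracks overlap them.

    snps: dict of snp_id -> (chromosome, position)
    mark_tracks: dict of track_name -> list of (chromosome, start, end),
                 with half-open intervals [start, end)
    Returns a list of (snp_id, sorted track names), best-supported first.
    """
    hits = defaultdict(set)
    for track, intervals in mark_tracks.items():
        for chrom, start, end in intervals:
            for snp_id, (snp_chrom, pos) in snps.items():
                if snp_chrom == chrom and start <= pos < end:
                    hits[snp_id].add(track)
    # Sort by descending number of supporting tracks, then by name for ties.
    ranked = sorted(snps, key=lambda s: (-len(hits[s]), s))
    return [(s, sorted(hits[s])) for s in ranked]

# Invented example data (not real coordinates or real variants).
example_snps = {
    "snp_a": ("chr4", 1500),
    "snp_b": ("chr4", 5200),
    "snp_c": ("chr12", 900),
}
example_tracks = {
    "H3K27ac": [("chr4", 1000, 2000), ("chr4", 5000, 5500)],
    "H3K4me1": [("chr4", 1200, 1800)],
    "DHS": [("chr4", 1400, 1600), ("chr12", 100, 200)],
}

ranking = rank_snps_by_marks(example_snps, example_tracks)
```

The nested loop is adequate for a short list of GWAS hits; a genome-scale analysis would normally use an interval index instead.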

It is well established that SNCA plays a central role in the pathogenesis of 
Parkinson’s disease. Point mutations in SNCA were the first genetic variants linked
to familial forms of Parkinson’s disease, and the SNCA protein is the major component
of Lewy bodies and Lewy neurites, which are considered the pathological
hallmarks of familial and sporadic Parkinson’s disease (Gasser et al. 2011; Singleton
et al. 2013). In addition, the SNCA locus represents one of the strongest Parkinson’s 
disease-associated GWAS hits (Nalls et al. 2014). Notably, multiplication of the 
entire SNCA locus was identified as causal for a rare autosomal-dominant form of 
Parkinson’s disease, indicating that a moderate increase of wild-type SNCA expres- 
sion (1.5 times in the case of genomic duplications) is sufficient to cause an 
autosomal-dominant form of Parkinson’s disease (Singleton et al. 2003; Miller 
et al. 2004; Devine et al. 2011; Kim et al. 2012). This observation is highly 
suggestive of a molecular mechanism by which risk variants in the SNCA locus 
modify the risk to develop Parkinson’s disease by slightly modulating the expression 
of SNCA. This clear link between SNCA expression and the development of
Parkinson’s disease in the context of genomic amplification therefore provides a 
good rationale for gene expression as a disease-relevant phenotypic readout to 
connect genetic variation to disease risk (Devine et al. 2011). Indeed, the first 
indication that the SNCA locus may contain risk alleles that modulate SNCA 
expression came from the identification of SNCA-Rep1, a complex polymorphic 
microsatellite repeat region approximately 10 kb upstream of the transcription start 
site. Multiple candidate gene association studies suggested that individuals who are 
homozygous for a shorter, “protective” repeat region (Rep1-257 or Rep1-259) have 
a significantly lower risk of developing Parkinson’s disease compared to individuals 
carrying the longer forms (Rep1-261 or Rep1-263; Kruger et al. 1999; Maraganore 
et al. 2006). Several functional studies, including the analysis of transgenic mice 
carrying different human SNCA-Rep1 alleles (Chiba-Falek et al. 2005; Cronin et al.
2009), suggested an “enhancer-like” function of the microsatellite repeat element
based on the cis-regulatory correlation between the SNCA-Rep1 repeat length and
SNCA expression. 


Allele-Specific Gene Expression as a Robust Read-Out 
to Analyze Cis-Regulatory Effects 


As explained in detail above, one of the major limitations of using hPSC-derived 
somatic cells to model disease in vitro is the considerable variability of the biological 
properties between individual cell lines. For SNCA, a gene whose expression is known to vary
between neural cell types such as astrocytes, oligodendrocytes and neurons and to be
regulated during development and terminal differentiation, cellular heterogeneity and
incomplete maturation significantly interfere with the detection of subtle differences in
gene expression between distinct risk genotypes or between patient and control cells.
Indeed, individual in vitro differentiation experiments from genetically
identical sub-clones resulted in up to fourfold differences in SNCA expression (Soldner
et al. 2016). To address this problem, we recently described an experimental approach 
that is based on determining the effect of individual regulatory elements on the 
transcription of the cis-regulated gene by analyzing allele-specific gene expression 
(Soldner et al. 2016). The deletion of just a single copy (heterozygous) of a candidate 
regulatory element or its exchange with an alternative disease-associated element 
affects only the expression of the cis-regulated gene on the same allele while
leaving the expression of the other, homologous allele unaltered. Consequently,
allele-specific gene expression would be biased towards lower or higher expression of 
the cis-regulated allele depending on the introduced genetic modification. Because 
expression is measured as the ratio between two individual alleles in every cell, this 
analysis is expected to be largely independent of cell homogeneity and can be applied 
to heterogeneous cell populations. In this respect, the non-targeted SNCA allele allows 
for a simple normalization and serves as internal control across isogenic samples. 
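
The reasoning behind the ratio-based readout can be illustrated with a small numerical sketch. All numbers below are invented, and `allelic_ratio` and `simulate_culture` are hypothetical helpers; the point is only that a factor shared by both alleles (for example, the fraction of SNCA-expressing cells in a culture) cancels out of the ratio.

```python
def allelic_ratio(targeted_signal, reference_signal):
    """Expression of the edited allele relative to the untouched allele."""
    if reference_signal <= 0:
        raise ValueError("reference allele signal must be positive")
    return targeted_signal / reference_signal

def simulate_culture(per_cell_targeted, per_cell_reference, expressing_fraction):
    """Bulk signal from each allele for a culture of mixed composition.

    expressing_fraction scales both alleles equally, mimicking variable
    differentiation efficiency between otherwise identical cultures.
    """
    return (per_cell_targeted * expressing_fraction,
            per_cell_reference * expressing_fraction)

# Hypothetical cis effect: the edited allele is expressed 10% higher.
culture_1 = simulate_culture(1.10, 1.00, expressing_fraction=0.2)
culture_2 = simulate_culture(1.10, 1.00, expressing_fraction=0.8)

total_1 = sum(culture_1)              # total expression differs between cultures...
total_2 = sum(culture_2)
ratio_1 = allelic_ratio(*culture_1)   # ...but the allelic ratio does not
ratio_2 = allelic_ratio(*culture_2)
```

In this toy setting the total signal differs several-fold between the two cultures, whereas the allelic ratio is the same in both, which is the property that makes the non-targeted allele useful as an internal control.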
To analyze allele-specific expression, we developed a robust, sensitive and highly
quantitative reverse transcription polymerase chain reaction (qRT-PCR) assay based 
on the detection of a heterozygous SNP in the 3'UTR of SNCA. Using CRISPR/Cas9 
genome editing, we generated an allelic series of isogenic cell lines by either
heterozygous deletion of the entire microsatellite repeat region (thought to have 
the most pronounced effect on SNCA expression) or insertion of SNCA-Rep1 
elements with all of the repeat length alleles (Rep1-257, Rep1-259, Rep1-261 and 
Rep1-263) that are present in the normal population. Using allele-specific expression 
as readout, we showed that neither the deletion of the microsatellite repeat SNCA-Rep1 
element nor its exchange for the shorter or longer repeat length risk alleles 
affected the cis-regulated expression of the linked SNCA allele, suggesting that this 
element has no clear role in SNCA regulation. This result conflicts with previous 
studies that supported an “enhancer-like” cis-regulatory effect of SNCA-Rep1 on the 
expression of SNCA. It is possible that difficulties in controlling the experimental 
variables of the transgenic mouse (Cronin et al. 2009) or neuroblastoma cell system 
(Chiba-Falek et al. 2005) used in the functional analyses, species-specific differ- 
ences of non-coding regulatory elements or the variability in analyzing human post- 
mortem brain tissue (Fuchs et al. 2008; Dumitriu et al. 2012) affected the validity of 
these conclusions. However, because in vitro differentiated cells allow only for the 
analysis of early events, due to the limited time in culture, we cannot completely 
exclude an effect of the SNCA-Rep1 element at later time points or only in 
combination with additional environmental factors. 

In contrast to SNCA-Rep1, the CRISPR/Cas9-mediated exchange of Parkinson’s 
disease-associated alleles spanning an enhancer element in the fourth intron that 
carries two risk SNPs (rs356168 and rs3756054) showed a significant effect on 
allele-specific expression of SNCA (Fig. 1; Soldner et al. 2016). When the protec- 
tive A-allele at SNP rs356168 was exchanged for the risk-associated G-allele, the 
expression of the cis-regulated SNCA allele was increased by 6–18%. In contrast, 
the exchange of the adjacent risk SNP rs3756054 showed no effect on allele-specific 
SNCA expression, suggesting that this variant only reaches genome-wide 
significance in GWAS because it is in LD with the functional 
risk-modifying SNP (Fig. 1). Given that a 1.5-fold increase in SNCA expression is 
sufficient to cause a familial autosomal-dominant form of Parkinson’s disease, 
these data support the notion that a modest life-long increase of SNCA expression 
may represent the molecular cause of the increased risk of developing Parkinson’s disease 
in individuals carrying the G-allele at this risk variant. Moreover, an expression 
quantitative trait loci (eQTL) analysis of SNCA expression in post-mortem adult 
brain samples suggested that a similar sequence-specific modest increase in SNCA 
expression occurs within the human population, further substantiating a functional 
role of the risk variant rs356168 in Parkinson’s disease (Soldner et al. 2016). This 
subtle effect on the expression of a disease-relevant gene is consistent with the 
hypothesis that the small effect sizes of common genetic risk variants contribute to the 
heritability of sporadic diseases. 

Fig. 1 Proposed model describing the effect of multiple Parkinson’s disease (PD)-associated risk 
variants on SNCA expression (modified from Soldner et al. 2016). The schematic illustrates the 
genomic organization of the SNCA locus, including the PD-associated risk variants SNCA-Rep1 
and the risk SNPs rs356168 and rs3756054, both located in a distal enhancer element in the fourth 
intron of SNCA. The analysis described in Soldner et al. (2016) suggests that the brain-specific 
transcription factors (TF) EMX2 and NKX6-1 show sequence-dependent binding at rs356168 with 
preference for the A-allele. The efficient TF binding in carriers of the protective A-allele results in 
a suppressed distal enhancer element and, consequently, in reduced expression of SNCA associated 
with reduced risk to develop PD. In contrast, the reduced TF binding in carriers of the PD 
risk-associated G-allele at this variant leads to a more active distal enhancer, resulting in increased 
expression of SNCA associated with an increased risk to develop PD. Notably, neither the repeat 
length of SNCA-Rep1 nor the PD-risk variant at rs3756054 significantly affects SNCA expression, 
suggesting that these elements are in linkage disequilibrium (LD) with other functional 
risk-modifying variants 


To gain insight into the molecular basis of how risk variants affect target gene 
expression, we analyzed TF binding data and identified two brain-specific TFs, 
EMX2 and NKX6-1, that bind to the enhancer element at the risk variant. Further 
analysis for sequence-specific binding indicated that both TFs, EMX2 and NKX6-1, 
preferentially bind to the protective, lower SNCA-expressing A-allele at rs356168 
(Fig. 1). These results suggest a model in which the sequence-dependent binding of 
these TFs at a distal enhancer element represses enhancer activity and thus 
modulates SNCA expression. Indeed, ectopic overexpression of both TFs in neurons 
reduced SNCA expression (Soldner et al. 2016), consistent with previous data in 
mouse models demonstrating their role as repressors of enhancer function (Ligon 
2003; Schisler et al. 2005; Schaffer et al. 2010; Mariani et al. 2012). Thus, our data 
provide a molecular link between GWAS-identified risk SNP-dependent changes in 
TF binding at a distal enhancer element, altered expression of SNCA and the risk to 
develop sporadic Parkinson’s disease (Fig. 1). EMX2 and NKX6-1 may physically 
interact and function in a complex to suppress enhancer activity. However, expres- 
sion analysis indicated that the two TFs are only expressed in a subset of neurons 
and are primarily not co-expressed in the same cell, suggesting that they may 
function at the same enhancer element in different cell types. TF-specific usage 
of identical regulatory elements in distinct cell populations might be a possible 
explanation for the selective vulnerability of distinct neuronal populations, as 
observed in Parkinson’s disease. 


Mechanistic Study of Sporadic Diseases: Conclusions 


As outlined in this review, a major challenge of modeling sporadic diseases in the 
culture dish is the inherent variability in differentiating hESCs or hiPSCs 
to functional cells. The variability is caused by genetic background differences 
between patient-derived hiPSCs and cells derived from control individuals as well 
as the inconsistency of most protocols to generate homogeneous cultures of differ- 
entiated cells. These issues complicate, if not exclude, the use of gene expression 
levels as a valid functional readout to define the molecular mechanisms of candidate 
disease risk variants, which are expected to only subtly alter the transcription of the 
downstream gene. As our analysis of the SNCA-associated risk variants demon- 
strates, two experimental strategies allow us to overcome these limitations: (1) the 
use of CRISPR/Cas9-mediated gene editing for generating disease-relevant and 
control lines that differ exclusively at the risk variant and (2) the development of an 
allele-specific assay that allows the robust detection of small differences in disease 
risk-associated gene expression, an assay that is independent of cell heterogeneity 
and extent of differentiation. 


Aquatic Model Organisms in Neurosciences: 
The Genome-Editing Revolution 


Jean-Stéphane Joly 


Abstract The use of aquatic model organisms in laboratories has greatly diversified. 
Zebrafish is the most advanced aquatic species for the use of CRISPR-Cas9. 
Because of the simplicity and broad applicability of this system, 
knock-out is now efficiently performed at medium scale. Forward genetics in zebrafish 
can now be performed by CRISPR-based F0 screening using high-speed and high-content 
phenotyping, for example by confocal imaging. Like zebrafish, marine model 
organisms have the prominent advantage of being transparent, especially at young 
stages (embryos and larvae) or when fixed samples are cleared by novel methods. The 
CRISPR-Cas9 system is routinely used in the ascidian Ciona intestinalis. It is also starting to 
be used in many other marine models, such as the medusa Clytia hemisphaerica. At the 
end of this review, we provide a list of aquatic model species and some examples of 
questions on the origin of our nervous system that can be addressed with these models, 
where the possibility of performing genome editing would constitute a major advance. 


Introduction 


With the expansion of biochemistry and molecular biology during the twentieth 
century, researchers focused more and more on a few model organisms such as 
nematodes, fruit flies, or mice. These models were amenable to many experimental 
approaches in molecular biology and biochemistry. Recently, a novel species, the 
zebrafish, has emerged as a major laboratory model. Initially selected because its 
transparent embryo is an excellent system in which to study development, it has 
now become the second most used animal in laboratories worldwide. Thus, the 
current applications of zebrafish studies are now highly diversified in neurobiology, 
immunology, adult physiology, oncology, and regenerative medicine, exploiting its 
advantages for in vivo approaches and imaging. 

In parallel, due to the explosion of sequencing methods, there has been a clear 
trend towards the diversification of model organisms, especially those used in 
neurosciences. In this paper, we will focus on applications of such new models in 
evolution of development or biomedical research. There are several reasons to use 
an increasing number of model organisms: first, the classical models are not 
representative of most branches of the tree of life, and second, many questions 
now need to be studied in vivo and established models are not always well adapted 
to many of those biological or medical questions. 

Here we elaborate on why the use of genome editing in these models offers 
revolutionary perspectives. Future genome-editing experiments should indeed allow 
us to unveil the function of critical genes in almost any species; this approach was 
previously restricted to a few model species, or even to the mouse for the most precise 
modifications by homologous recombination. With genome editing, it will become 
possible with functional data to study the evolutionary origin of highly diversified cell 
types such as neurons and, in addition, to interrogate how extremely complex cellular 
organizations such as those found in the brain were built in the course of evolution. 


Zebrafish: With the CRISPR-Cas9 System, Forward Genetic 
Screens Are Back Again 


Zebrafish is a vertebrate, so it has a body plan fundamentally similar to ours (Onai 
et al. 2014): like humans, zebrafish have a notochord, a central rod of 
turgid cells that confers rigidity to embryos and larvae and serves as the support 
axis for the development of the spine. Muscles are located on both sides of the 
notochord. The nervous system is found dorsally and the intestine ventrally. Zebrafish 
embryos, during the so-called “phylotypic stage” (Slack et al. 1993), use the same 
developmental pattern as human embryos, involving the collinear activation in 
time and space of Hox gene expression to build axial structures. 

Many organs, including brain, also have similar general organisations in zebrafish 
and humans. Hence, various aspects of neurobiology can be studied in this species. 
Zebrafish has a tripartite brain. Brains, like other organs in fish such as fins or hearts, 
regenerate after a mechanical injury. Some complex behaviours like fear (Amo et al. 
2014), social behaviour (Chou et al. 2016) and memory can be studied in adults. 
Additionally, more basic behaviours can be studied in the 5-day larvae, at a stage 
most amenable for imaging and for which no authorization for animal experimentation 
is required (Naumann et al. 2016). This makes the model easy to use for applications 
such as neurotoxicology performed by academic labs or cosmetology companies. 

At the end of the twentieth century, the zebrafish was suggested to be a promising 
model for genetic screens relevant to human diseases (Mullins et al. 1994). How- 
ever, during the evolution of vertebrates, an additional genome duplication occurred 
in the teleost fish lineage, leading to the presence of many duplicated genes in fish 
genomes that greatly complicate the analysis of screen data. Another pitfall was that 
the zebrafish has maternal factors carried by the egg. So, when a gene is mutated, the 
effect of the mutation is in most cases not visible at early stages because maternal 
factors are present to ensure correct development. Therefore, a disappointingly low 
number of mutations were identified following large-scale screens by random muta- 
genesis in embryos. Because many zebrafish screens were performed during early 
larval stages, most identified mutants failed to exhibit phenotypes similar to human 
rare diseases, which often appear much later in human life. 

After 2000, zebrafish became a very useful model for reverse genetics when 
phenotypical analyses of mutants were performed with spectacular time-lapse 
imaging in live fish, for example, during early development (Olivier et al. 2010), 
neurogenesis (Barbosa et al. 2015), hematopoiesis (Renaud et al. 2011) or immune 
response (Levraud et al. 2014). 

Nowadays, the zebrafish is the most advanced aquatic species for the use of 
CRISPR-Cas9 in laboratories (Shah and Moens 2016). In this species, however, a 
challenge remains: insertions of point mutations (for example, to mimic 
human missense mutations) are still generated at low rates (Renaud et al. 2016). For 
these applications, the very fast early development of the zebrafish is a drawback, as it 
probably makes the repair events following DNA cutting by the CRISPR protein 
highly mosaic and hard to detect in the progeny. Improvements to target the repair 
construct to the nucleus of the one-cell stage embryo will have to be developed. 
Modified oligonucleotides could improve knock-in (KI) rates. Alternatively, plasmid 
constructs with long homology arms have been used in a recently published method 
(Hoshijima et al. 2016) to perform KI at large scale in zebrafish; unfortunately, this 
method remains labor intensive. 

Knock-out is now efficiently performed in zebrafish at medium scale (Shah et al. 
2015). Hence so-called forward genetics in zebrafish again seems to have a bright 
future. To study the molecular basis of a given phenotype in a particular cell type, 
large gene families can be targeted for mutations, and, importantly, mutations of 
duplicated genes and of their close paralogs can be performed jointly, due to the 
possibility of injecting arrays of CRISPR guide RNAs. 
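
As a minimal illustration of how one guide RNA might be chosen to hit duplicated genes jointly, the sketch below lists 20-nucleotide protospacers that are followed by an NGG motif (the protospacer-adjacent motif recognized by the commonly used Streptococcus pyogenes Cas9) and keeps those shared by two paralogs. The function names and the toy sequences are hypothetical, only the forward strand is scanned, and no off-target or efficiency scoring is attempted, so this is a sketch under stated assumptions rather than a guide-design method.

```python
import re

def find_protospacers(sequence, length=20):
    """Return (position, protospacer) pairs followed by an NGG PAM (forward strand)."""
    sequence = sequence.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % length)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]

def shared_guides(paralog_a, paralog_b, length=20):
    """Protospacers present in both sequences, i.e. candidates hitting both paralogs."""
    sites_a = {spacer for _, spacer in find_protospacers(paralog_a, length)}
    sites_b = {spacer for _, spacer in find_protospacers(paralog_b, length)}
    return sorted(sites_a & sites_b)

# Toy sequences (hypothetical): both contain the same 20-nt site before an NGG.
site = "GATTACAGATTACAGATTAC"
gene_a = "CC" + site + "AGG" + "TTT"
gene_b = "GA" + site + "CGG" + "AT"
candidates = shared_guides(gene_a, gene_b)
```

Guides found only in one paralog could be pooled with the shared ones when no common site exists, which corresponds to the injection of arrays of guide RNAs mentioned above.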

For large-scale forward screens, methods of large-scale phenotyping, primarily 
by 3D imaging, still need to be optimized. Thus, a current priority for zebrafish 
researchers is to improve rapid imaging technologies at large scale and at later stages, 
to make zebrafish a better model. Such a model would provide a perfect context for 
analyzing large collections of mutants. Tissue-clearing methods (Seo et al. 2016) and 
high-speed imaging methods, such as those using highly sensitive video cameras, 
have recently emerged and will certainly be crucial for these approaches. 


Optimizing the CRISPR-Cas9 System in Transparent Marine 
Animals 


Like zebrafish, most marine model organisms have the obvious advantage of having 
transparent embryos and larvae, a feature selected in water throughout evolution to 
escape predators. Transparency is crucial for the microscopic analysis of development 
in these models. Eggs can generally be obtained in huge numbers. Although they are 
sometimes quite big because of the presence of vitelline reserves, embryos are com- 
posed of only a few cells and the lineage analysis is thus easy in these simple and 
compact embryos. In ascidian embryos, for example, the notochord only has 64 cells. 
Significant progress in understanding the human cardiac developmental gene network was 
made in ascidian models. This unique insight provided direction for the reprogramming 
of cell lineages in human cell cultures: following the observation that Ci-ets1/2 and 
Ci-mesp generated cardiac progenitors in ascidians, researchers transdifferentiated 
human dermal fibroblasts into cardiac progenitors (Islas et al. 2012). 

The CRISPR-Cas9 system was used in a study of the ascidian Ciona intestinalis 
(Stolfi et al. 2014). This study, from Lionel Christiaen’s group, reported the success 
of tissue-specific genome editing in this species. Optimization of plasmid constructs 
was performed, in which specific ubiquitous U6 promoters were used to drive guide 
RNA expression, and tissue-specific promoters were designed to drive the expression 
of the Cas protein. Introducing the CRISPR-Cas9 components in ascidians 
was quite easy because a large number of eggs could be electroporated with plasmid 
DNA, producing both the CRISPR protein and the guide RNAs. Nevertheless, 
while breeding of this species in inland laboratories has been performed (Joly 
et al. 2007), it remains difficult. Improvements are still required to reliably 
culture stable lines of transgenic animals. 

The CRISPR-Cas9 system is beginning to be used in many other aquatic model 
organisms (see Square et al. 2015 for a fascinating example in lampreys). Notably, 
unpublished experiments have been performed in the medusa Clytia 
hemisphaerica (Tsuyoshi Momose, CNRS Villefranche-sur-Mer, personal 
communication). This model recently emerged as a remarkable cnidarian species useful for 
evo-devo studies (Houliston et al. 2010). Experiments were in part supported by the 
French network named “Etude Fonctionnelle sur les Organismes Modèles (EFOR, 
www.efor.net)”, promoting research—including genome editing and imaging—in 
metazoan model organisms. 

Success in Clytia is due to obtaining full life cycles in laboratory aquaria, generating 
quasi-immortal, vegetatively growing colonies. Moreover, adult medusae spawn 
daily, generating transparent and easy-to-inject eggs. Rates of CRISPR-Cas9 knock-outs 
are strikingly high: over 700 embryos, all with potential knock-outs as seen by the loss 
of fluorescent protein activity, can be generated in a single injection experiment. 

Injection of the NLS-Cas9 protein/sgRNA into unfertilized eggs can be 
performed as soon as 1 hour after ovulation and before subsequent fertilization. 
In this condition, the Cas9 protein probably has time to be targeted to its cutting site 
before the first division of the embryo occurs. One first exciting application of this 
method was the deletion of green fluorescent proteins, making newly generated 
transgenic lines suitable for imaging applications. Indeed, in these species, endo- 
genous fluorescent activity hinders potential observation of GFP in newly generated 
transgenic animals. The availability of mutants in such species will offer novel 
routes for fundamental research in evolution and development (Galliot et al. 2009). 
In such marine species, a challenge remains to keep animals alive in captivity 
long term. Significant investments will also be needed to obtain colonies of 
inbred lines, which should become reference laboratory lines for the corresponding 
communities of worldwide researchers. 


More and More Aquatic Model Organisms 
for Diversified Uses 


Examples of emerging aquatic models of increasing evolutionary distances from verte- 
brates are described below. These models are located at several key nodes of the meta- 
zoan branches of the tree of life. Alternative fish models to zebrafish (Schartl 2014) 
allow the study of particular evolutionary processes, such as adaptation to cave life in 
Astyanax mexicanus. To study the emergence of synapomorphic (specific and 
shared) vertebrate features in evolution, lampreys (including the sea lamprey, Petro- 
myzon marinus, an agnathan), lancelets (the cephalochordate amphioxus, Branchio- 
stoma lanceolatum) or tunicates (such as the urochordate Ciona intestinalis) are very 
relevant models. Other more distantly related bilaterians such as the polychaete annelid 
Platynereis dumerilii provide insight into which features were already present in the 
common ancestor of all bilaterian organisms, the so-called “urbilateria.” Even more 
distant metazoans, with no bilateral symmetry but rather radial symmetry, are also used 
in laboratories. Thus, Ctenophores (sea gooseberries) and Cnidarians (corals, jellyfish, 
sea anemones) can be used to study ancient features of nerve cell types. Also, sponges 
and placozoans constitute fascinating basal Metazoans, with no nervous system. 


In Biomedical Research, Why and How Should We Use 
Aquatic Models to Study Diseases of the Nervous System? 


A first obvious use of model organisms is to generate so-called “models of diseases.” 
In most cases, these models are mutants or transgenic animals that reproduce, as closely 
as possible, pathological conditions observed in humans, such as neural degeneration. With the 
advent of precise genome-editing methods, the capacity to generate point mutations 
in any model by introducing a repair construct bearing the mutations will constitute a 
true revolution. Indeed, mutations at orthologous positions to variations found in 
human diseases can be generated in these model animals if genomes can be aligned 
in the region surrounding the mutation. Then the phenotypical effects of the abnor- 
mal protein function can be described, for example, using live imaging to observe 
abnormal cell behaviors, such as proliferation or migration. Therein resides the 
great advantage of these aquatic models, with a diversity of developmental and 
genomic contexts and transparency allowing easy imaging. 

Moreover, from an applied perspective in regenerative medicine, understanding how 
cell types emerged during evolution helps us identify crucial pathways that are, for 
example, active in stem cells in normal and pathological conditions. The CRISPR-Cas9 
system applied to aquatic model organisms will offer an unprecedented opportunity 
to characterize the key genes and pathways that are active following injury or 
degeneration. These genes and pathways promote regeneration responses in animals with regenerative 
capacities and could be useful in regenerative medicine, if (re)activated in humans 
(Karra and Poss 2017). 


A Short Natural History of the Nervous System: Several 
Questions on Its Origin 


This chapter provides examples of what marine model organisms bring us in the 
domain of neuroscience. Many essential questions can indeed be examined with 
these models, providing us more knowledge about how the human brain was shaped 
through evolution, and this should help us better understand pathologies and their 
pleiotropic effects. 

Evolution indeed sifts through the noise and allows us to focus on key genes and 
pathways that have remained crucial for specific cell types throughout evolution. 
Looking at extant species, and describing common features that are likely to be 
ancestral and shared, is also a way to “reconstruct” the nervous system of the 
putative last common ancestor between the two compared species. In this domain, 
the absence of fossils of nervous systems and brains has impeded researchers. 

According to Detlev Arendt (Arendt 2008), homology hypotheses are based on 
the comparison of genes, cytological features and ontological location in the body 
of the embryo. Functional experiments with genome editing in model organisms 
will add extremely important cues to this domain. 

The Ctenophores are colorful planktonic animals that have a sophisticated 
nerve net allowing them to swim and to emit beautiful waves of fluorescent 
flashes. A long-standing debate is whether these animals have neuron-like cells, 
which would have appeared independently of the neurons of our nervous system 
during evolution. In this respect, examining the phylogenetic position of these 
animals is essential: are they closely related to bilaterians, or do they rather 
form an out-group of metazoans? If they are more distantly related to us than 
sponges, this would indeed mean that the nervous system was invented twice in 
evolution, because it is very unlikely that sponges, which would be closer relatives 
to bilaterians, secondarily lost neural cells. 

Until this controversy is resolved, it will not be possible to know whether there 
were two independent origins of the nervous system in animals, which would, of 
course, be a very exciting possibility. However, as argued in a recent review (Jager 
and Manuel 2016), many lines of evidence now suggest that ctenophores are closely 
related to bilaterians and that neurons appeared only once in evolution. In favour of 
a single nervous system type is the presence of the well-known neurogenesis SoxB 
gene and the presence of acetylcholine and numerous common GPCRs. In any case, 
ctenophores should provide key insights into deeply conserved features of 
animal neural cells. 

Marine organisms also allow us to examine how the nervous system became 
condensed and centralized (Arendt et al. 2016) while becoming much more com- 
plex in the course of evolution to terrestrial life (Nomaksteinsky et al. 2009). 
Starting from a diffuse nerve net in a swimming larva, brains became huge, formed 
from the so-called embryonic neural plate in vertebrates later undergoing neurula- 
tion to form the neural tube. 

Another question about the origin of the nervous system is how the vertebrate 
tripartite nervous system arose in the course of evolution. The vertebrate central 
nervous system is composed of anterior brain, posterior brain and spinal cord. Such an 
organisation arose with the emergence of borders between brain domains, such 
as the midbrain/hindbrain boundary. Recently, following studies in an annelid 
worm, Arendt and colleagues proposed that the nervous system of the common 
bilaterian ancestor was probably composed of two independent domains 
(corresponding to the vertebrate forebrain/midbrain and hindbrain) that later 
fused during evolution (Tosches and Arendt 2013). 

Very recently, and in line with Arendt’s hypothesis, Chris Lowe’s group proposed 
that the adult body plan of an indirect-developing hemichordate develops by 
adding a Hox-patterned trunk to an anterior larval territory, confirming the hypothesis 
that marine larvae are “swimming heads” (Gonzalez et al. 2017). 

Concerning the origin of vertebrate synapomorphic characters, ascidians provide 
evidence of the precraniate or prevertebrate origins of the neural crest. Neural crest 
cells are stem cells with migratory behaviors and the capacity to differentiate into 
incredibly diversified tissues and locations. The evolutionary origin of the neural crest 
is obscure. These cells were first thought to be a vertebrate synapomorphy. Bill Jeffery 
identified migratory pigment cells in ascidians (Jeffery et al. 2004; Jeffery 2006) 
and, more recently, Lionel Christiaen’s group identified migrating neuron precur- 
sors in the ascidian Ciona intestinalis. Interestingly, these precursors were shown to 
arise from the border of the neural plate, a hallmark of the neural crest in vertebrates 
(Stolfi et al. 2015). 


Conclusion 


The CRISPR-Cas9 system and other genome-editing techniques can now be used in 
various aquatic model organisms for extremely diversified applications, which 
will lead to new models of brain diseases for biomedical research. Also, these 
models will provide strategies to characterize human genetic variations linked to 
diseases at large scale and suggest new avenues for regenerative medicine, because 
of the exceptional abilities of most aquatic models to regenerate. Basic biological 
knowledge will, of course, benefit from this revolution, and one might expect 
many more revolutionary discoveries from the exploration of the genomes of 
multiple aquatic animal species using CRISPR-Cas9 approaches. 


Genome-Wide Genetic Screening 
in the Mammalian CNS 


Mary H. Wertz and Myriam Heiman 


Abstract Genes linked to major neurodegenerative diseases, including 
Alzheimer’s, Parkinson’s, and Huntington’s diseases, were first identified over 
15 years ago, but neither a full molecular explanation for the cell loss seen in 
human patients nor a curative therapy has yet been achieved for any of these 
diseases. In most model organisms, when new hypotheses are needed to explain a 
cellular process, genetic screens are the tool of choice. For example, ‘synthetic 
lethal’ screens can lead to the identification of genes that enhance the toxicity of a 
particular mutation, revealing pathways critical for surviving the mutation’s effects. 
To date, however, genome-wide unbiased screens are not feasible in mammalian 
central nervous system neurons except in vitro, which fails to capture the relevant 
disease pathologies, and no genome-wide screens have yet been conducted in the 
mammalian central nervous system. We outline in this short monograph the steps 
needed to implement a methodology that allows for genome-wide genetic screening 
in the central nervous system of mice to study both normal and degenerative disease 
gene function. 


Introduction 


Genome-wide genetic screens have been used for decades in S. cerevisiae, 
C. elegans, and D. melanogaster to elucidate many important aspects of cell 
biology. Such traditional mutagenesis-based genome-wide genetic screens have 
been impossible to routinely perform in mice due to the prohibitively large number 
of mice that would be needed. However, the ability to perform such screens in the 
nervous system would enable the generation of new hypotheses regarding the 
molecular mechanisms of disease. For example, unbiased genome-wide genetic 
screens could reveal genes that are involved in the toxicity of disease-associated 
mutations, such as mutations in the huntingtin gene that are found in human 
Huntington’s disease patients. Such neurodegenerative disease-focused genetic 
screens have been attempted in S. cerevisiae, C. elegans, and D. melanogaster, 
but these screens by definition fail to capture the full complexity of mammalian 
neurons—an important point, given the widely varying susceptibility seen amongst 
cell types in neurodegenerative diseases. Alternatively, genome-wide genetic 
screens that utilize mammalian neuron-like cells have been conducted in vitro, 
but these screens are also unable to recapitulate many aspects of in vivo neurons 
in the mammalian central nervous system (CNS). The in vivo context may be 
essential to many aspects of CNS biology, given for example the diversity of 
CNS cell types, the likely importance of both cell autonomous and non-cell 
autonomous factors in neurodegenerative diseases, and the known age dependency 
of most neurodegenerative diseases. Ideally, these screens would be done in 
mammalian neurons in their native cellular environment. 

To bypass the difficulties associated with classical mutagenesis screening as well 
as the diploid nature of mammalian genomes, genome-wide short hairpin RNA (shRNA) 
and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR- 
associated protein 9 (Cas9) screening approaches have been applied to mammalian 
cells in vitro (e.g., among many others, Moffat et al. 2006; Root et al. 2006; Shalem 
et al. 2014; Wang et al. 2014; Zhou et al. 2014). Despite the power of these 
methodologies, there are many challenges to their application in vivo, especially 
in the CNS. Indeed, mammalian genome-wide shRNA or CRISPR genetic screens 
have been conducted mainly either in vitro, in transformed cell lines, or else in 
primary cells manipulated ex vivo and then returned in vivo (Chen et al. 2015; 
Graham and Root 2015). Based on the insights that have come from such studies, 
genome-wide genetic screening could be a powerful tool for the study of normal 
cellular function and degenerative disease processes in the mammalian CNS, 
provided that such screens are performed in the context of models that recapitulate 
the relevant biology. For this reason, we recently developed a genetic screening 
workflow that allows rapid, high-sensitivity screening in the mouse CNS for aging 
and neurodegenerative disease processes (Shema et al. 2015). This workflow 
combines the use of (1) pooled lentiviral shRNA libraries; (2) stereotaxic injection 
of these pools into mouse models of neurodegenerative disease and wild-type 
littermates; (3) incubation of injected libraries, such that shRNAs that enhance 
neurodegenerative disease gene toxicity lead to cell death; and (4) sequencing and 
analysis of the remaining shRNA elements in all surviving cells in order to 
determine which constructs have enhanced cell death and thus ‘drop out’ of library 
representation (Fig. 1). 
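
Step (4) of this workflow amounts to comparing the representation of each library element between mutant and control tissue. The following is a minimal Python sketch of that comparison, not the analysis pipeline used in the original study: the function name, the pseudocount, and the toy read counts are illustrative assumptions.

```python
import math

def log2_fold_changes(mutant_counts, control_counts, pseudocount=1.0):
    """Per-element log2(mutant / control) of reads-per-million values.

    The pseudocount (an assumed value) avoids division by zero when an
    element has dropped out of the mutant sample entirely.
    """
    mutant_total = sum(mutant_counts.values())
    control_total = sum(control_counts.values())
    lfc = {}
    for element, control_reads in control_counts.items():
        mutant_reads = mutant_counts.get(element, 0)
        mutant_rpm = (mutant_reads + pseudocount) / mutant_total * 1e6
        control_rpm = (control_reads + pseudocount) / control_total * 1e6
        lfc[element] = math.log2(mutant_rpm / control_rpm)
    return lfc

# Toy read counts: shB is depleted tenfold in the disease-model sample.
control = {"shA": 1000, "shB": 1000, "shC": 1000}
mutant = {"shA": 1000, "shB": 100, "shC": 1000}

lfc = log2_fold_changes(mutant, control)
depleted = sorted(lfc, key=lfc.get)[:1]  # most strongly depleted element
print(depleted, round(lfc["shB"], 2))
```

Elements with strongly negative values are candidates for having "dropped out"; a real analysis would also need replicates and a statistical model, as discussed later in the chapter.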

For our genetic screening workflow, we first used shRNA viral libraries, since 
genome-wide shRNA libraries for the mouse genome are available and have been 
successfully utilized in many studies. In our pilot screen, we chose to target genes 
that enhanced the lethality of a fragment of the mutant huntingtin gene. 

Fig. 1 Genome-wide genetic screening in the mammalian CNS. Pooled viral libraries containing 
shRNAs, gRNAs, or cDNAs are first concentrated via ultracentrifugation to a high titer suitable for 
bilateral injection into the striatum (or other CNS target area) for in vivo transduction. After 
injection, viral payloads are allowed to integrate into the host cell genome and express for several 
weeks. During this time, genetic perturbations that enhance toxicity in a disease model context 
may enhance cell death. The targeted tissue is then carefully dissected and the genomic DNA is 
extracted. After PCR and sequencing of library elements, deconvolution and analysis reveal the 
library representation. Those genes that enhance cell death in vivo will be depleted or lost from the 
library (red arrows) in the mutant as compared to control animals and thus can be identified as 
potential modifiers of neuronal toxicity (orange barcode). These genes can then be confirmed in 
follow-up validation experiments 


Huntington’s disease is the most common inherited neurodegenerative disorder, but 
the molecular pathways that are essential for mutant Huntingtin protein’s toxicity 
in vivo are not fully understood. Huntington’s disease is particularly amenable to 
genetic screening, as it is a monogenic disease for which several mouse models 
exist (Huntington’s Disease Collaborative Research Group 1993; Mangiarini et al. 
1996), and the most greatly affected brain region (caudate-putamen/striatum) is a 
well-delineated sub-cortical structure. Since Huntington’s disease displays an aging 
component (Mattson and Magnus 2006), we first chose to target a set of genes that 
showed altered expression both during normal aging and upon mutant 
Huntingtin expression in CNS neurons. Of these genes, we identified one, Gpx6, 
that enhances the toxicity of mutant Huntingtin protein when its expression is 
reduced and that partially reverses Huntington’s disease-like symptomatology 
when overexpressed in mouse striatum (Shema et al. 2015). With this proof-of- 
principle study complete, we outline below parameters that will be essential to 
extend this methodology to perform genome-wide screening in the 
mammalian CNS. 


Genome-Wide Viral Library Preparation and Delivery 


Stable and long-term transduction of post-mitotic neurons by lentivirus has been in 
use for over 20 years (Naldini et al. 1996a, b). The available genome-wide shRNA 
or CRISPR guide RNA (gRNA) viral libraries described to date are typically 
packaged with a vesicular stomatitis virus-G (VSV-G) envelope because of the resulting 
high stability and wide host cell range of the virus (Moffat et al. 2006; Root et al. 
2006; Shalem et al. 2014; Wang et al. 2014; Zhou et al. 2014). VSV-G 
pseudotyping additionally enhances the neuronal tropism of lentivirus (Burns 
et al. 1993; Yee et al. 1994). Concentration of the initially obtained viral superna- 
tants by ultracentrifugation yields high titers of intact VSV-G pseudotyped virus 
(Burns et al. 1993; Yee et al. 1994) that are essential for in vivo stereotaxic 
injections into the brain. As lentivirus is a relatively large virus (~100 nm), its 
diffusion is limited in the dense neuropil of the mammalian CNS. Given this 
consideration, injection parameters must be carefully optimized for each target 
tissue region (Cetin et al. 2006). Adeno-associated virus (AAV) represents another 
potential delivery vehicle for pooled screens. As AAV is a small (~20 nm) 
non-enveloped virus that can be concentrated to very high titers, it is ideal for 
in vivo CNS delivery and, for this reason, AAV vectors have been widely used in 
human gene therapy clinical trials (Hocquemiller et al. 2016). Drawbacks to using 
AAV include its limited payload size (~4.5 kb), which limits the ability to perform 
cDNA overexpression screens, and the fact that the AAV serotype to be used may 
need to be optimized for the CNS cell type of interest. 

The choice of viral library payload will depend on the experimental goals of the 
screening project but, in principle, cDNA, shRNA, or CRISPR gRNA libraries 
could all be used to interrogate CNS gene function. A recent study that compared 
the results of both shRNA and CRISPR/Cas9 gRNA screens to identify essential 
genes in a leukemia cell line found modest correlation between screen results 
(Morgens et al. 2016), and in some biological contexts it may be that both 
knockdown (shRNA or CRISPRi; Qi et al. 2013) and knockout (CRISPR) strategies 
should be employed to examine disease-relevant mechanisms (Deans et al. 2016). 

Once a viral library is chosen and prepared, the number of cells needed for 
genome-wide screening should be estimated to determine the feasibility of 
conducting screening in the desired CNS cell population. Based on past shRNA 
and CRISPR gRNA screens, approximately 1000 cells should be targeted per 
library element, depending on the details of the screen. This number is necessary to 
average out noise in the assay itself, heterogeneity in the genetic perturbation 
induced in each cell, and inherent variability in the response of the 
screened cells to the perturbation (Graham and Root 2015). Thus, for a CRISPR 
gRNA library that contains approximately four gRNAs per protein-coding gene, the 
80,000 library elements should each be targeted to approximately 1000 cells (thus 
80 million cells in total across all replicates). Reducing either biological or techni- 
cal variability, for example by employing a more homogeneous cell population, can 
reduce the number of cells needed in each screen. The time between injection of the 
library and harvesting of the cells for analysis will be determined by experimental 
goals and could range from several days to months, depending on the rate of 
progression of the CNS phenotype being screened. 
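
The arithmetic behind this estimate can be written out explicitly. The sketch below is only a worked version of the numbers in the text; the gene count of 20,000 is an assumption chosen so that four gRNAs per gene gives the 80,000 elements quoted above, and the function name is hypothetical.

```python
def cells_required(n_genes, elements_per_gene, cells_per_element=1000):
    """Return (library size, total cells to transduce across all replicates)."""
    n_elements = n_genes * elements_per_gene
    return n_elements, n_elements * cells_per_element

# Assumed: ~20,000 protein-coding genes, four gRNAs each, 1000 cells per element.
n_elements, n_cells = cells_required(n_genes=20_000, elements_per_gene=4)
print(n_elements, n_cells)
```

Lowering `cells_per_element`, for example because a more homogeneous cell population reduces variability, scales the total requirement down proportionally.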


Interpretation of Results 


As in other pooled RNAi/CRISPR screens, in CNS genome-wide screens genomic 
DNA is extracted from the target tissue and subjected to PCR for constant regions in 
the shRNA/gRNA sequences. The samples are then barcoded, pooled, sequenced, 
and run through deconvolution analysis to determine the representation of each 
individual library element. A few key factors that determine the quality and the 
interpretation of the results are the number of elements targeting each individual 
gene, the type of element (shRNA, gRNA, or cDNA), and the depth of sequencing. A number of 
different methods and tools have been designed to analyze pooled screening data, 
and these differ based on library complexity and the type of element used to induce 
the perturbation. Analytical tools developed to assign enrichment/depletion 
scores in RNAi and CRISPR genome-wide screens include, for example, 
Model-based Analysis of Genome-wide CRISPR/Cas9 Knockout (MAGeCK), 
RNAi Gene Enrichment Ranking (RIGER), and STARS, which rank genes based 
on the magnitude and consistency with which the elements targeting each gene 
are depleted or enriched 
(Luo et al. 2008; Li et al. 2014; Doench et al. 2016). Another tool, Cas9 high- 
Throughput maximum Likelihood Estimator (casTLE), can be used to combine 
data of shRNA and gRNA screens to increase sensitivity (Morgens et al. 2016). 
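
To illustrate what "magnitude and consistency" mean at the gene level, the following Python sketch aggregates per-element log2 fold changes into a per-gene summary. It is a deliberately simplified stand-in and does not reproduce the statistics of MAGeCK, RIGER, STARS, or casTLE; the depletion threshold, function names, and toy values are assumptions.

```python
from collections import defaultdict
from statistics import median

def gene_scores(element_lfc, element_to_gene, threshold=-1.0):
    """Summarize each gene as (median LFC, n elements depleted, n elements)."""
    by_gene = defaultdict(list)
    for element, lfc in element_lfc.items():
        by_gene[element_to_gene[element]].append(lfc)
    scores = {}
    for gene, values in by_gene.items():
        n_depleted = sum(v <= threshold for v in values)  # consistency
        scores[gene] = (median(values), n_depleted, len(values))  # magnitude first
    return scores

def rank_depleted(scores):
    """Order genes from most to least depleted by median LFC."""
    return sorted(scores, key=lambda gene: scores[gene][0])

# Toy data: geneA is depleted by three of four elements, geneB by only one.
element_lfc = {
    "A1": -2.5, "A2": -2.0, "A3": -1.8, "A4": 0.1,
    "B1": -3.0, "B2": 0.2, "B3": 0.0, "B4": 0.1,
}
element_to_gene = {name: "gene" + name[0] for name in element_lfc}

scores = gene_scores(element_lfc, element_to_gene)
ranking = rank_depleted(scores)
print(ranking, scores)
```

In this toy case a single strongly depleted element (B1) does not make geneB a hit, which is the intuition behind requiring several independent elements per gene.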
A primary genome-wide in vivo screen may yield hundreds of hits, and inde- 
pendent validation of these targets is necessary to confirm the assay results and the 
gene specificity of the observed effects and to understand the role of the genes in 
modifying disease phenotypes (Fig. 2). Two strategies for validation of genome- 
wide in vivo screening can be utilized to assess performance of the primary screen 
and confirm hits. Creation of sub-pool libraries allows efficient validation of several 
hundred potential hits. This strategy has been used to validate findings in vitro and 
in cells reintroduced in vivo (Chen et al. 2015). Sub-pool elements could include 
shRNAs or gRNAs that target genes that were unchanged in the primary screen, an 
additional 4-5 shRNAs or gRNAs for the primary screen hit genes, and carefully 
designed C911 controls that reveal shRNA seed-related off-target effects (Buehler 
et al. 2012). 

Fig. 2 Validation of in vivo screening hits. A primary genome-wide in vivo screen is completed 
with at least 4-6 elements targeting a single gene, leading to libraries composed of 
~80,000-120,000 elements. Validation of genes identified in the primary screen can be completed 
with smaller sub-pool libraries of only ~10,000-20,000 elements, which must be carefully 
designed to include an increased number of unique elements (~10) targeting the positive hits 
identified in the genome-wide screen as well as appropriate controls. These controls come in the 
form of elements targeting non-genomic sequences, genes unchanged in the primary screen, and 
C911 controls that can reveal seed-related off-target effects of hits. Sub-pool validation using a 
combination of multiple modalities (i.e., cDNA, gRNA and shRNA) may also be used to increase 
confidence in hits. Additional validation at the single-gene level can then be performed via viral 
transduction of two to three targeting elements and appropriate controls or else traditional 
knockdown/knockout/overexpression studies. Such single-gene validation is particularly 
important for investigation of behavioral and pathogenic readouts of disease processes as well as 
biochemical mechanisms underlying modification of toxicity 

A second approach to validation is by interrogation of individual hits 
via traditional single-gene knockout/knockdown/overexpression studies. To do 
this, in addition to classical germline genetic perturbations, CNS viral delivery of 
top screen-hit validated shRNAs/gRNAs/cDNAs by stereotaxic injection can be 
used to rapidly introduce a single genetic perturbation, as is routinely performed in 
many CNS studies with AAV or retroviral vectors. This type of more traditional 
validation approach has the advantage that it can be used to assay various behav- 
ioral and pathological readouts of disease progression and to tease out specific 
biochemical pathways. 
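
The C911 controls mentioned above are, as described by Buehler et al. (2012) for siRNAs, made by replacing bases 9 through 11 of the targeting sequence with their complement, which disrupts on-target pairing while leaving the seed region unchanged. The Python sketch below illustrates that transformation only; the function name and example sequence are hypothetical, and how the positions map onto a particular shRNA design is an assumption that would need to be checked for the library in use.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def c911_control(guide, first=9, last=11):
    """Complement (not reverse-complement) bases first..last, 1-indexed."""
    guide = guide.upper()
    if len(guide) < last:
        raise ValueError("sequence is shorter than the last complemented position")
    flipped = "".join(_COMPLEMENT[base] for base in guide[first - 1:last])
    return guide[:first - 1] + flipped + guide[last:]

# Hypothetical 21-nt targeting sequence written in the DNA alphabet.
original = "GATTACAGATTACAGATTACA"
control = c911_control(original)
print(original)
print(control)
```

If a hit and its C911 control produce the same phenotype, the effect is more likely to be seed-mediated than gene-specific.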

In addition to validation of targets from a single primary screen utilizing a 
particular genetic perturbation, comparison of data from two different modalities, 
i.e., both shRNA knockdown and gRNA knockout, or cDNA overexpression and 
gRNA knockout, may be beneficial. This cross-platform approach has been shown to 
produce varying degrees of overlap in identified targets (Deans et al. 2016; Evers 
et al. 2016; Morgens et al. 2016), highlighting the possible utility of applying 
several types of perturbations in a multi-armed screen to enhance the specificity 
of hits or else to expand the type of hits that can be obtained (e.g., certain 
phenotypes may only be revealed upon gene knockdown, not knockout). While 
primary genome-wide cDNA screens may be challenging due to the limited efficiency of 
packaging genome-wide cDNAs into viral vectors, the potential for use in sub-pool 
screening of a smaller number of genes is much higher. Therefore, a combination of 
these techniques (cDNA overexpression, shRNA knockdown, gRNA CRISPR or 
CRISPRi) may yield increased sensitivity to uncover biological pathways relevant 
to neuronal function and dysfunction. 


Future Directions 


Looking forward, the ability to perform cell type-specific genome-wide genetic 
screens will be helpful to fully understand CNS disease mechanisms, as most 
neurological diseases display cell type-specific patterns of vulnerability, including 
the two most prevalent neurodegenerative diseases, Alzheimer’s disease and 
Parkinson’s disease (Mattson and Magnus 2006). The use of a conditional 
Cas9-expressing mouse line crossed to one that expresses Cre recombinase in 
the cell type of interest should allow such cell type-specific CRISPR knockout or 
CRISPRi gRNA screens. Conditional or inducible systems for use with mamma- 
lian retroviral vectors (Beier et al. 2011) could be useful for lentiviral-based 
shRNA or cDNA overexpression screens. Genome-wide genetic screening in the 
mammalian CNS may make it possible to interrogate molecular mechanisms 
linked to all the major neurodegenerative diseases and eventually to identify 
common vulnerability factors that may exist among these diseases, for example, 
aging-related and proteostasis pathways. Finally, the ability to perform genetic 
screening in the CNS around a non-death phenotype (e.g., biomarker expression 
using flow-sorting to isolate the hit cells) would greatly expand the power of 
genome-wide approaches. 


Abstract The zebrafish (Danio rerio) has emerged in recent years as a powerful 
vertebrate model to study neuronal circuit development and function, thanks to its 
relatively small size, rapid external development and translucency. These features 
allow the easy application of in vivo microscopy analysis and optical perturbation 
of neuronal function. So far, genetic manipulation in zebrafish has been limited to 
the generation of constitutive loss-of-function alleles and transgenic models. 
CRISPR/Cas9 offers unprecedented possibilities for genomic manipulation that 
can be exploited to study neuronal function. In the past few years, we have 
developed two CRISPR/Cas9-based approaches that overcome previous major 
limitations to the study of gene and neuron functions in zebrafish. The study of 
gene function via tissue- or cell-specific mutagenesis remains challenging in 
zebrafish, yet certain loci require tight spatiotemporal control of gene 
inactivation. This is particularly true when studying the function of a gene in 
post-mitotic neurons if the same gene has an earlier developmental function. 
To circumvent this limitation, we developed a simple and versatile protocol to 
achieve tissue-specific and temporally controlled gene disruption based on Cas9 
expression under the control of the Gal4/UAS binary system (Di Donato et al. 
2016). This strategy allows us to induce somatic mutations in genetically labeled 
cell clones or single cells and to follow them in vivo via reporter gene expression. 
We have also been able to target endogenous genomic loci to specifically label the 
great variety of neuronal cell types with reporter genes such as the transcriptional 
activator Gal4 (Auer et al. 2014). As a result, we can specifically target the 
expression of fluorescent proteins, a genetically encoded calcium indicator or 
optogenetic actuators in defined neuronal subpopulations. 

We will present ways that these two methods can be applied to the study of the 
development of the nervous system in larval zebrafish. 


CRISPR/Cas9 and Gal4/UAS Combination for Cell-Specific 
Gene Inactivation 


Over the last decades, the analysis of gene function has relied on mutagenesis 
approaches leading to the generation of loss-of-function alleles. The CRISPR/Cas9 
system represents a major step forward towards achieving precise and targeted gene 
disruption. Being readily applicable for the creation of knockout loci in a great 
variety of animal models used in neuroscience studies, this technology has led to 
significant advances in the fields of developmental and functional neurobiology 
(Heidenreich and Zhang 2016). Nonetheless, constitutive gene disruption is often 
associated with side effects, such as compensation mechanisms and embryonic 
lethality, representing an important limitation on the analysis of phenotypes specific 
to the nervous system, since neural circuits are fully established at late stages of 
development. Recently, studies in worms (Shen et al. 2014), fruit flies (Port et al. 
2014), mice (Platt et al. 2014) and zebrafish (Ablain et al. 2015) have pioneered the 
use of the CRISPR/Cas9 methodology to generate conditional gene knockouts via 
tissue-specific expression of cas9. This strategy takes advantage of cell type- 
specific promoters to control the spatiotemporal expression of the Cas9 enzyme. 
Importantly, one of the most common methodologies ensuring cell-specific expres- 
sion of transgenes in zebrafish is the Gal4-UAS binary system (derived from yeast), 
in which the transcription of genes placed 3’ of an upstream activating sequence 
(UAS) relies on the DNA binding of the Gal4 transcriptional activator (Asakawa 
and Kawakami 2008). Gene- and enhancer-trap methods have been applied to 
establish a significant number of Gal4 transgenic lines (Davison et al. 2007; 
Asakawa et al. 2008; Scott and Baier 2009; Kawakami et al. 2010; Balciuniene 
et al. 2013), several of which are neural-specific (Scott et al. 2007; Asakawa et al. 
2008). Notably, in these lines the Gal4 open reading frame (ORF) is randomly 
integrated in the fish genome through Tol2-based transposition, and the insertion 
site is not mapped; therefore, the sequence of the promoter elements driving Gal4 
expression is unknown. In our work, we have developed a flexible conditional 
knockout strategy based on the CRISPR/Cas9 technology that combines Gal4/ 
UAS-mediated expression of the Cas9 enzyme with a constitutive expression of 
sgRNAs driven by PolIII U6 promoter sequences. Our strategy does not require 
previous knowledge of promoter sequences to induce cas9 expression since this is 
provided by cell type-specific Gal4 transcription. Additionally, to enable the anal- 
ysis of the phenotypes arising from Cas9-induced gene disruption, we marked the 
population of the cas9-expressing cells by using the viral T2A self-cleaving peptide 
(Provost et al. 2007), ensuring the stoichiometric synthesis of the Cas9 enzyme and 
the fluorescent reporter GFP from the same mRNA. To test our conditional knock- 
out strategy, we used our vector system to target the tyrosinase (tyr) locus, coding 
for a key enzyme involved in melanin production (Camp and Lardelli 2001). We 
were able to induce eye-specific loss of pigmentation by expressing our transgene 
exclusively in the progenitors of the neural retina and the retinal-pigmented 
epithelium (RPE). For this purpose we used a transgenic line, Tg(rx2:gal4), in 
which the Gal4 trans-activator is specifically driven in the optic primordium by the 
promoter of the zebrafish retinal homeobox gene 2 (rx2; Heermann et al. 2015). 
This result confirmed the ability of our strategy to induce Gal4- and Cas9-mediated 
tissue-specific gene inactivation. Remarkably, in this first approach, GFP expres- 
sion was strictly dependent on the temporal activity of the promoter driving Gal4 
expression, thus restricting direct detection of potential mutant cells to a limited 
time window. This caveat reduces the possibility of analyzing loss-of-function 
phenotypes after Gal4 transactivation activity has terminated. To circumvent this 
issue, we proposed to use the activity of the Cre enzyme, a topoisomerase that 
catalyzes the site-specific recombination of DNA between loxP sites (Branda and 
Dymecki 2004; Pan et al. 2005), to constitutively label the population of Cas9- 
expressing cells. We therefore developed a construct where we substituted the GFP 
with a Cre reporter, enabling the analysis of gene disruption after Cas9 activity has 
terminated. The visualization of cre-expressing cells is commonly achieved with 
the use of transgenic lines carrying a cassette where a constitutive promoter drives 
the expression of a fluorescent reporter upon the Cre-mediated excision of a floxed 
stop codon. Thus, in cells carrying floxed alleles, the concomitant expression of 
Cas9 and Cre enzymes by a tissue-specific Gal4 promoter would ensure, respec- 
tively, double-strand breaks (DSBs) at the targeted locus as well as the recombi- 
nation of the floxed locus. Notably, if the Cre-dependent expression of a reporter is 
constitutive after recombination, all the cells deriving from a cas9-expressing 
progenitor will be fluorescent, allowing long-term visualization of potentially 
mutated clones of cells. By using our system in retinal stem cells, we successfully 
disrupted the atoh7 gene, which is involved in the specification of retinal ganglion 
cells (RGC) in the developing retina. In this case, we could modify cell fate 
determination of retinal progenitor cells and generate labeled loss-of-function 
clones lacking the population of RGC. 

Additionally, we employed our method to create genetic chimeras in which single 
mutant cells could be differentially tagged in a wild-type tissue. To obtain this 
labeling, we combined the 2C-Cas9 system with the Brainbow technology. The 
Tg(UAS:brainbow) line (Robles et al. 2013) carries a transgene in which the CDSs of 
the fluorescent proteins tdTomato, Cerulean and YFP are separated by Cre 
recombinase sites. In double transgenic embryos Tg(UAS:brainbow) x Tg(Tissue- 
specific promoter:gal4), tdTomato will be expressed in the Gal4 transactivation 
domain in the absence of Cre-mediated recombination. In contrast, cerulean or YFP 
will be transcribed if Cre recombinase is active. The expression of our transgenesis 
vector in these embryos provides simultaneous activity of the Cas9 and Cre 
enzymes. As a result, all the Gal4-positive cells that received the plasmid are 
potentially mutant and marked by cerulean or YFP fluorescence, whereas the 
population of Gal4-positive cells that do not express the construct is wild-type and 
labeled with the reporter tdTomato. This multicolor labeling strategy can be easily 
applied to neurobiology studies to induce targeted mutations in single neurons and 
directly compare loss-of-function and wild-type phenotypes in the same animal. To 
test this potential application, we targeted the genomic locus coding for the motor 
protein Kinesin family member 5A, a (kif5aa) (Campbell and Marlow 2013; Auer 
et al. 2015), whose inactivation triggers the reduction of RGC axon arbor complexity 
via a cell-autonomous mechanism (Auer et al. 2015). To target the kif5aa gene with 
the 2C-Cas9 system in single RGC, we used the Tg(isl2b:gal4) line. As expected, 
after injection of our construct into one-cell stage embryos derived from a cross of 
Tg(isl2b:gal4) and Tg(UAS:brainbow) fish, we could observe a strong decrease in 
total branch length in YFP- or Cerulean-expressing RGC (potentially kif5aa mutant) 
compared to tdTomato-fluorescent RGC (wild-type). 
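
The labeling logic of this multicolor strategy can be summarized as a small decision table. The Python sketch below simply restates the paragraph above; the function name and labels are illustrative, and the "unlabeled" case for Gal4-negative cells is an inference from the UAS dependence of the transgenes rather than a statement from the text.

```python
def expected_label(gal4_active, received_construct):
    """Return (fluorescent label, inferred genotype) for a single cell."""
    if not gal4_active:
        # Inference: UAS-driven reporters, Cas9 and Cre stay silent without Gal4.
        return ("unlabeled", "wild-type")
    if received_construct:
        # Cas9 and Cre are co-expressed: Cre switches the Brainbow cassette.
        return ("Cerulean or YFP", "potentially mutant")
    # No Cre activity: the default Brainbow reporter is expressed.
    return ("tdTomato", "wild-type")

for gal4 in (True, False):
    for construct in (True, False):
        print(gal4, construct, expected_label(gal4, construct))
```

Because mutant and wild-type cells carry different colors within the same tissue, their morphologies can be compared directly in one animal.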

In conclusion, the 2C-Cas9 system represents a versatile tool to induce biallelic 
conditional gene inactivation. The use of the Gal4/UAS system allows the targeting 
of a gene of choice in any cell population. The combination of this bipartite system 
with simultaneous activation of Cas9 and Cre enzymes in progenitor or differentiated 
cells enables first, the genetic lineage tracing of mutant cells and second, the 
detection of cell-autonomous gene inactivation at single cell resolution. Addition- 
ally, permanent labeling of knockout cells offers the possibility of investigating 
gene function in adult animals, expanding the applicability of the 2C-Cas9 from 
neurodevelopment to maintenance and function of neural networks. Finally, 
because the 2C-Cas9 system is based on genetic tools available in several model 
organisms, this approach allows the same level of investigation in a broad range of 
animal models. 

In addition to the use of CRISPR/Cas9 for the generation of loss-of-function 
alleles, RNA-guided nucleases can be used for more sophisticated genome 
modifications such as homologous recombination (HR) or non-homologous end 
joining (NHEJ)-mediated knockin. We herein provide a conceptual outline of the 
steps involved in the generation of knockin lines based on the CRISPR/Cas9 strategy 
and the latest advances made in the zebrafish genome-editing field. 


CRISPR/Cas9-Mediated Knockin Approaches in Zebrafish 


With its advantage of transparency, the zebrafish model organism rapidly emerged 
as a powerful experimental system for studies in genetics, developmental biology 
and neurobiology. The ability to integrate exogenous genes into any given locus 
and to analyze their function in the living animal has dramatically improved 
over the past few years with the development of genome editing technologies. Prior 
to this recent explosion in the field of knockin generation, conventional transgenic 
zebrafish lines were generated by Tol2-mediated transgenesis, which has success- 
fully allowed the making of hundreds of new reporter lines essential to the study of 
particular gene functions in vivo (Davison et al. 2007; Asakawa et al. 2008; Scott 
and Baier 2009; Kawakami et al. 2010; Balciuniene et al. 2013). Bacterial artificial 
chromosome-based transgenesis has been and still is one of the go-to methods for 
making reporter lines. However, this technique comes with one major limitation: 
the integration of extra coding copies of hundreds of kbs. In addition, it is not 
known how the integration of such a large construct affects the neighboring site of 
insertion. More recently, the transcription activator-like effectors (TALEs) 
technology, a milestone in the development of zebrafish mutant and transgenic lines, 
has lifted the limit of locus-specific targeting. With very low off-target effects, 
TALEs were the first successful genome editing method that permitted 
homology-directed repair (HDR) and NHEJ-mediated knockin in 
zebrafish (Bedell et al. 2012; Zu et al. 2013). Two reports (Chang et al. 2013; 
Hwang et al. 2013b) showed that double-strand breaks (DSBs) could also be 
generated using the CRISPR/Cas9 technology, which is simpler in design and has 
higher mutagenesis efficiency, based on the same approach used by Bedell et al. (2012). 
Following these studies, Hruscha et al. (2013) achieved the integration of HA-tags 
using single-stranded oligonucleotides in which the tag sequence was flanked by two 
short homology arms of the targeted gene. Similarly to previously observed integration 
events, insertion of the sequences of interest was detected in most targeted alleles, 
although the majority of integrations were imprecise, reflecting error-prone repair 
mechanisms. In 2013, Zu 
et al. reported the first HR gene-targeting event using TALENs and a double-stranded 
vector containing an eGFP cassette flanked by long homology arms, with 
a germ line transmission rate of 1.5%. More recently many other laboratories have 
developed various methods to generate knockin alleles by HR following 
CRISPR/Cas9-induced DSB, using as donor single-stranded DNA, circular or linear 
plasmids with short (~40 bp) or long (800-1000 bp) homology arms (Hruscha et al. 
2013; Hwang et al. 2013a; Irion et al. 2014; Shin et al. 2014; He et al. 2015; Hisano 
et al. 2015). Although these methods have proven feasible, their efficiency remains 
variable. To circumvent this problem, in 2014 our laboratory employed a strategy 
taking advantage of homology-independent repair events, shown to be tenfold 
more active than HR events in the one-cell stage embryo (Auer and Del Bene 
2014; Auer et al. 2014). The plasmid donor vector was engineered with an eGFP 
bait cassette and a Gal4 transcriptional transactivator cassette. When the donor was 
co-injected with a locus-specific sgRNA, an eGFP-targeting sgRNA and cas9 nuclease 
mRNA, the donor vector was cleaved concurrently with the endogenous chromosomal 
integration site. For better readout, the injection was performed into an 
outcross of two transgenic lines, the first being an eGFP reporter line and the 
second a Tg(UAS:RFP) line. Injected embryos with a successful in-frame integration 
event (most probably through homology-independent repair mechanisms) 
therefore displayed RFP signal in cells where GFP signal was normally detected. 
In this system, transmission to offspring was estimated at about 30% and 
increased to 40% when a selection for the RFP signal was performed after 
injection. The generation of such a donor vector allowed the direct assessment 
of the efficiency of the strategy by targeting an endogenous locus of the zebrafish 
genome. When the transcriptional start site of the kif5aa gene was targeted, integration of 
the donor vector was successfully induced and shown to be independent of the 
orientation of the sgRNA targeting kif5aa. In addition, no homologous sequences 
between the vector and the endogenous targeted site were required for the integration, 
allowing the re-use of the vector in combination with any given site-specific sgRNA. 

Fig. 1 Knockout and knockin strategies based on the CRISPR/Cas9 technology in zebrafish. 
Schematic representation of the different methods and applications of CRISPR/Cas9-mediated 
genome modifications. From top to bottom: (1) labeling with GFP of cas9-expressing cells 

Using the same approach, Kimura et al. (2014) improved the strategy by adding a heat 
shock cassette (Hsp70) upstream of the transcription trans-activator Gal4 cassette 
into the donor vector, allowing its expression independently from in-frame insertion 
events within the transcriptional start site of the gene of interest. To date, several 
new reporter lines have been generated using this strategy, establishing 
homology-independent repair as a powerful alternative to HR-mediated integration. Key 
points for its success are (1) the identification of efficient sgRNAs targeting the 
chromosomal site of choice, for which new prescreening methods have been developed 
(Carrington et al. 2015; Prykhozhij et al. 2016); (2) the injection of the sgRNA 
mix with Cas9 nuclease mRNA rather than purified Cas9 protein, which seems to prevent 
donor plasmid insertion; and (3) further screening for the identification of founders 
due to the error-prone nature of junction sites between the endogenous locus and the 
donor vector. Hisano et al. (2015) addressed this last point by introducing 10—40 bp 
homology arms into the donor vector to trigger integration events mediated by HR 
repair mechanisms. In parallel, Li et al. (2015) developed another approach by 
targeting intronic regions of the gene of interest, therefore non-HR dependent. 
While this strategy allows keeping the integrity of the targeted coding sequence, 
the enriched presence of repeat sequences within the introns makes it difficult to 
achieve a specific targeting. Finally, the latest advance in knockin approaches is the 
development of traceable genome editing events that allow the easy recovery of 
edited alleles (Hoshijima et al. 2016) (Fig. 1). 
Dissecting the Role of Synaptic Proteins 
with CRISPR 


Salvatore Incontro, Cedric S. Asensio, and Roger A. Nicoll 


Abstract A significant step forward in the study of synaptic physiology is the 
application of single cell genetic modifications. In this landscape, the dissection of 
the role of single proteins or, more significantly, their subunits and sub-domains has 
increased enormously the basic knowledge of synaptic function. CRISPR/Cas9 is a 
recently developed genome-editing tool that can be used to inactivate or modify 
genes of interest. Its ease of implementation and affordable cost, combined with its 
high efficiency, make it a very valuable tool to study various biological processes. 
The application of this technique in addition to previous genetic approaches vastly 
simplifies and accelerates the study of specific synaptic proteins. Here we illustrate 
different ways that CRISPR/Cas9 can be used in the study of synaptic properties. 


Introduction 


Over the last two decades, the combination of pharmacology and genetics has been 
instrumental in our current understanding of the molecular mechanisms controlling 
diverse neuronal processes. The development of gene targeting through homolo- 
gous recombination enabled the generation of knockout (KO) transgenic animals, 
and this ability to completely inactivate genes of interest for synaptic transmission 
has provided invaluable information about their function. Although germline gene 
deletion has dramatically advanced our knowledge, the approach suffers from two 
main limitations: the deletion can be embryonically lethal if the gene is essential or 
it can lead to physiological compensation during development, masking the real 
importance of the studied protein. In addition, the development of transgenic 
animals represents a significant investment of both cost and time. 

The more recent development of RNAi has provided an easier and faster way to 
inactivate proteins, but use of this technique is limited by the efficiency of 
knockdown. In the case of incomplete knockdown, residual protein can lead 
to serious misinterpretation. In addition, off-target effects are an important 
concern: RNAi manipulation has been observed to affect the 
morphology of single spines (Alvarez et al. 2006), suggesting some general 
non-specific effects of RNAi in neurons. 

More recently, the development of conditional knockout (cKO) technology has 
offered an interesting alternative to the limitations associated with both germline KO 
and RNAi approaches. Indeed, the cKO approach relies on the generation of transgenic 
mice with LoxP sites flanking a gene of interest. The subsequent sparse transfection of 
Cre recombinase in brain slices derived from these LoxP animals results in the removal 
of the gene of interest from a few neurons and offers a more controllable way to 
compare genetically manipulated neurons to controls by dual cell patch clamp (Adesnik 
et al. 2008; Pluck 1996; Hayashi et al. 2000; Schnell et al. 2002; Sauer and Henderson 
1988; Tsien et al. 1996). This method is particularly powerful for studying proteins that 
are essential for the maintenance of synaptic equilibrium. For example, this genetic 
inactivation approach has been used successfully to determine the function of single 
subunits of the excitatory post-synaptic AMPA and NMDA receptors (Lu et al. 2009; 
Gray et al. 2011) and the role of the different isoforms of the SNARE protein complex 
machinery at the pre-synapse (Hun et al. 2014; Han et al. 2011; Maximov et al. 2007). 
Nevertheless, the same time and cost considerations associated with the development of 
germline KO animals apply to the Cre-LoxP system. 


Genome Editing Using CRISPR/Cas9 


Genome editing generally relies on the guided activity of endonucleases to generate 
double-strand breaks at a specific location in the genomic DNA in order to modify 
it. In eukaryotic cells, there are two main types of DNA repair mechanism following 
double-strand DNA breaks: non-homologous end joining (Barnes 2001; Lieber 
2010) and homologous recombinational repair. Non-homologous end joining is 
generally accompanied by small sequence changes (deletions, insertions 
or nucleotide substitutions) in the repaired region, thus often leading to inactivation 
of the targeted gene. On the other hand, homologous recombination uses a 
homologous DNA sequence as a template to repair the double-strand DNA breaks. The 
outcome of this type of repair is generally more precise and controllable, so it can 
be used to either introduce point mutations or knock in entire coding sequences through the 
use of a repair template. 

CRISPR/Cas9 is a recently developed genome-editing technique arising from a 
bacterial adaptive defense system against invading plasmids or phages. The term 
CRISPR stands for Clustered Regularly Interspaced Short Palindromic Repeats. 
These CRISPR loci are found in bacteria and are composed of partially palindromic 
non-coding repeats that are separated by non-repetitive spacers of similar length. 
These repeats and spacers are transcribed into one long RNA transcript that is 
further processed into smaller CRISPR RNAs by endonucleases encoded by 
CRISPR-associated (Cas) genes flanking the CRISPR loci (Ishino et al. 1987; 
Nakata et al. 1989; Pourcel et al. 2005; Jansen et al. 2002). Each individual CRISPR 
RNA corresponds to one repetitive unit of the original CRISPR array and will guide 
Cas nucleases to their target by recognizing the homologous DNA region. To work 
as a defense mechanism, new spacers deriving from invading plasmids or phages 
are added to the CRISPR locus (Bolotin et al. 2005; Pourcel et al. 2005). Once 
transcribed and processed into CRISPR RNAs, these new spacers then serve as 
memory signatures of past invasions, enabling the bacteria to recognize and cleave 
foreign DNAs (Makarova et al. 2006). 

As a genome-editing tool, the technique relies on the nuclease activity of the product of one of 
these Cas genes (SpCas9), derived from Streptococcus pyogenes. The activity of 
SpCas9 depends on two of these processed RNAs: a CRISPR RNA and a trans- 
activating CRISPR RNA, which combine to form an RNA complex. The critical 
features of this complex are the presence of a double-stranded RNA structure at the 
3’ end that physically interacts with SpCas9 and a 20-nucleotide sequence at the 5’ 
end, which guides the binding of SpCas9 to the target DNA by homology (Jinek 
et al. 2012). In addition, the proper targeting of SpCas9 requires the presence of a 
short sequence immediately adjacent to the complementary sequence on the target DNA. This sequence is 
called the protospacer adjacent motif (PAM) and, in the case of SpCas9, consists of 
a nucleotide triplet (NGG). Importantly, in the absence of the PAM, Cas9 cannot 
recognize target sequences even when they are fully complementary to the guide 
RNA (Sternberg et al. 2014). By engineering chimeric single RNAs consisting of a 
fusion between the trans-activating CRISPR RNA and the CRISPR RNA, it 
becomes possible to mimic the natural RNA complex and to control the targeting 
of Cas9 to a specific region of the genome by simply changing the 5’ complementary 
sequence of the RNA complex (Jinek et al. 2012; Jiang et al. 2013). This 
so-called guide RNA consists of 20 nucleotides complementary to the region of 
interest, whose only requirement for its design is the presence of a PAM at the 3’ 
end (on the target DNA). As this motif is very frequent in eukaryotic genomes 
(Wu et al. 2014), it becomes possible to target virtually any gene of interest, making 
CRISPR/Cas9 a very powerful and promising tool for basic research as well as for 
potential therapeutic use. Unlike other genome-editing tools requiring the design 
and generation of specific nucleases for each target site, CRISPR/Cas9 relies on a 
simple two-component system: Cas9 and a target-specific guide RNA. 
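
The design rule described above (a 20-nucleotide target immediately followed by an NGG PAM on the target DNA) can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions: the function names and the example sequence are hypothetical, only the SpCas9 NGG motif is considered, and no off-target scoring of the kind offered by online design tools is attempted.

```python
# Minimal sketch: list candidate SpCas9 target sites (20 nt + NGG PAM).
# Function names and the example sequence are hypothetical illustrations.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_spcas9_sites(seq: str, guide_len: int = 20):
    """Scan both strands for guide_len-nt protospacers followed by NGG.

    Returns tuples (strand, start, protospacer, pam). For the "-" strand,
    `start` is counted on the reverse-complemented sequence.
    """
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        # i is the index of the first PAM base; the protospacer ends at i.
        for i in range(guide_len, len(s) - 2):
            pam = s[i:i + 3]
            if pam[1:] == "GG":  # N can be any base, the next two must be G
                hits.append((strand, i - guide_len, s[i - guide_len:i], pam))
    return hits


if __name__ == "__main__":
    example = "ACGTACGTACGTACGTACGT" + "TGG"  # hypothetical 23-nt fragment
    for hit in find_spcas9_sites(example):
        print(hit)
```

Each reported protospacer would still need to be checked for similarity to other parts of the genome before use.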


Practical Considerations for the Use of CRISPR/Cas9 


The careful design of guide RNAs represents one of the key steps in successful use of 
CRISPR/Cas9. The first step consists of choosing the best region to target within the 
gene of interest and subsequently scanning this sequence for the presence of PAM 
motifs. When selecting guide RNAs, it is important to consider the possibility that the 
non-homologous end joining repair mechanism might lead to in-frame deletions 
resulting from Cas9 cleavage at position -3 from the PAM (see Figs. 1 and 2). If 
structural information about the protein is available, it can be used to select a region that 
is essential for its stability. Unfortunately, for most proteins this information does not 
exist, and the best strategy to efficiently inactivate the gene of interest is usually to 
target one of the first exons in order to minimize the chance of generating a truncated 
but functional protein. When potential 20 bp sequences have been selected, several on-line 
tools enable users to find sequences with the lowest probabilities for off-target effects 
based on their lack of similarities to other parts of the genome. 

Fig. 1 (a) Timeline of the CRISPR_GRIN1 GluN1 deletion experiment and scheme of dual 
whole-cell voltage-clamp recording in organotypic hippocampal slices of a biolistically 
transfected pX330 CRISPR_GRIN1 neuron and a neighboring wild-type neuron. (b) Representative 
phase contrast + epifluorescence image of the CA1 region of a hippocampal slice and 
confocal image of a CRISPR_GRIN1 neuron co-transfected with a FUGW-EGFP plasmid. Scale 
bar: 20 μm. (c) Sample traces of NMDAR-evoked EPSCs from a transfected CRISPR_GRIN1 
neuron and a neighboring control in the presence of NBQX (10 μM). (d) Targeted GRIN1 region 
and types of insertions or deletions in the DNA after infecting dissociated hippocampal neurons 
with lentiCRISPR GRIN1 (adapted from Fig. 1 of Incontro et al. 2014) 
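
The concern about in-frame deletions can be made concrete with a small Python sketch. It assumes, as a simplification, that an indel within a coding exon preserves the reading frame exactly when its net length change is a multiple of three; substitutions, splice-site effects and newly created stop codons are ignored. The function names and example numbers are hypothetical.

```python
# Minimal sketch: locate the expected SpCas9 cut and classify indels by frame.
# All names are hypothetical; the frame rule ignores substitutions and splicing.


def cut_index(protospacer_start: int, guide_len: int = 20) -> int:
    """Index of the first base after the cut, 3 bp upstream of the PAM."""
    return protospacer_start + guide_len - 3


def classify_indel(net_length_change: int) -> str:
    """Classify an indel by its net length change in base pairs."""
    if net_length_change == 0:
        return "no length change"
    # Python's % returns a non-negative result for a positive modulus,
    # so deletions (negative values) are handled correctly.
    return "in-frame" if net_length_change % 3 == 0 else "frameshift"


def frameshift_fraction(net_length_changes) -> float:
    """Fraction of indel alleles (non-zero length change) that shift the frame."""
    indels = [d for d in net_length_changes if d != 0]
    if not indels:
        return 0.0
    shifted = sum(1 for d in indels if d % 3 != 0)
    return shifted / len(indels)


if __name__ == "__main__":
    observed = [-1, -3, 2, 0, -7]  # hypothetical sequenced alleles
    print(cut_index(0), [classify_indel(d) for d in observed])
    print(frameshift_fraction(observed))
```

Applied to sequenced alleles, a tally of this kind gives the proportion of frameshifting indels among the edited alleles.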

The rescue experiments also provide a powerful tool to assess the role of specific 
protein domains. By transfecting cDNAs with point mutations or domain deletions, it 
becomes possible to assess their significance for the biological process being studied, 
similar to what has been done with conditional KO animals (Herring et al. 2013). 

[Figure 2, panels a-f: scatterplots of NMDAR and AMPAR eEPSC amplitudes in transfected versus control neurons (pA), time course of NMDAR eEPSCs against days of transfection, and scheme of the targeted GRIN1 region spanning an exon and an intron] 

Fig. 2 (a) Scatterplot and sample traces of NMDAR eEPSCs in neurons transfected with CRISPR/Cas9 
for 14 days and in neighboring control neurons. Open circles represent amplitudes of NMDA EPSCs for single 
cells; filled circles represent the mean. (b) Time course of NMDAR eEPSC 5, 7, 10, and 15 days 
after transfection. The evoked currents are eliminated after 12 days. (c) Scheme of the time course 


The Use of CRISPR/Cas9 in Neurons: Proof of Concept 


To test the potential of the CRISPR/Cas9 technology in neuroscience, we have 
performed a proof of concept study aimed at assessing its efficiency to inactivate 
synaptic proteins. In particular, we focused on two fundamental subunits of the 
ionotropic glutamate receptors in hippocampal slice cultures: the GluN1 subunit of 
NMDA receptors and the GluA2 subunit of AMPA receptors. We began by 
designing two different guide RNAs targeting the extracellular part of the GluN1 
subunit, and we selected guide RNAs with a score >70% corresponding to a low 
probability of off-target effects according to the MIT online CRISPR design tool. 
We then co-introduced by biolistic transfection into hippocampal slices a plasmid 
encoding both Cas9 and one of the guide RNAs together with a plasmid encoding 
GFP as described previously for cKOs (Adesnik et al. 2008). As the efficiency of 
this transfection approach is modest, the system as a whole is only minimally 
perturbed and it becomes possible to directly compare recordings obtained simul- 
taneously from a target, transfected neuron (GFP positive) and a control, 
untransfected neighbor neuron (GFP negative; Fig. 1a, b). NMDA currents 
(eEPSCs) were completely abolished in 100% of the pyramidal neurons analyzed 
(Fig. 1c). Consistent with previous results (Adesnik et al. 2008), we also observed a 
compensatory increase in AMPA currents. We sequenced the DNA region targeted 
by Cas9 after PCR amplification of the genomic DNA and found the presence of 
various small insertions and deletions (indels) creating frameshifts in 90% of the 
cases (Fig. 1d). This first set of experiments thus suggests that Cas9 is able to 
efficiently inactivate genes in adult pyramidal neurons by creating double-strand 
DNA breaks, which are repaired by the non-homologous end joining system. The 
extreme efficiency that we observed contrasts with the efficiency reported by others 
using different cell types and is somewhat surprising, but probably reflects the post- 
mitotic nature of adult pyramidal neurons. In contrast to dividing cells, which 
rapidly dilute the Cas9 machinery, neurons have the ability to maintain high levels 
of the CRISPR/Cas9 components for a longer period. Under these conditions, Cas9 
will presumably have sufficient time to cut the targeted region until it can no longer 
be properly repaired. 

Fig. 2 (continued) and percentages of control of NMDAR-evoked EPSCs after CRISPR_GRIN1 
biolistic transfection. (d) Scheme of the targeted region in the GRIN1 gene. The guide RNA not 
including the PAM region is shown in bold; the intronic part of the gene, which includes the PAM 
region, is shown in blue. (e) Scatterplot and sample traces of NMDA eEPSCs from a transfected 
CRISPR_GRIN1 neuron + GluN1 cDNA and a neighboring control neuron. Scale bar: 50 pA and 
50 ms. (f) Scatterplot and sample traces of AMPA eEPSCs from a transfected CRISPR_GRIN1 
neuron + the GluN1 cDNA and a neighboring control neuron. Scale bar: 50 pA and 50 ms 
(Adapted from Figs. 2 and 3 of Incontro et al. 2014) 
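
The argument that persistent Cas9 expression in post-mitotic neurons drives editing toward completion can be illustrated with a toy calculation. The model below is an assumption made purely for illustration and is not taken from the experiments: each cut-and-repair cycle is treated as independent, with a fixed probability that repair introduces an indel that destroys the target site, after which the allele can no longer be cut. The function name and the parameter values are hypothetical.

```python
# Toy model (assumption, not measured data): an intact allele is cut and
# repaired repeatedly; each cycle destroys the target site with probability
# p_indel, otherwise the site is restored and can be cut again.


def fraction_edited(p_indel: float, n_cycles: int) -> float:
    """Expected fraction of alleles carrying an indel after n_cycles."""
    if not 0.0 <= p_indel <= 1.0:
        raise ValueError("p_indel must lie between 0 and 1")
    if n_cycles < 0:
        raise ValueError("n_cycles must be non-negative")
    # An allele stays intact only if every cycle was repaired faithfully.
    return 1.0 - (1.0 - p_indel) ** n_cycles


if __name__ == "__main__":
    # Hypothetical values: few cycles (Cas9 diluted by cell division)
    # versus many cycles (Cas9 maintained in a non-dividing neuron).
    for cycles in (1, 3, 10, 30):
        print(cycles, round(fraction_edited(0.2, cycles), 3))
```

Under these assumptions the edited fraction approaches 1 as the number of cycles grows, consistent with the qualitative explanation given in the text.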

To rule out the existence of off-target effects, we also performed rescue exper- 
iments by transfecting a GluN1 cDNA. Re-introduction of the deleted subunit by 
co-transfection fully rescued the phenotype (Fig. 2). 

In a subsequent part of our project, our goal was to target multiple genes at once. As a 
proof of concept, we repeated the same experiment targeting both the GluN1 and GluA2 
subunits, this time co-transfecting the two plasmids containing the respective gRNAs. We 
observed a complete deletion of both subunits, with complete rectification of 
AMPA receptor currents (due to the loss of GluA2) and no NMDA eEPSCs (Fig. 3). 

Another issue is the possibility of studying the effect of a protein deletion 
in vivo. The advent of Cas9 opens a very exciting possibility: we can now inject 
gRNAs to target potentially any protein in a wild-type (WT) background 
(co-transfecting with Cas9 plasmids) or in Cas9 knock-in animals (Platt et al. 
2014). For example, the AMPA receptor triple-floxed mouse has been 
very important for understanding each subunit’s contribution to the structure 
and function of glutamatergic excitatory synapses. Now we can reproduce 
these results in a few weeks (compared to the years needed to create Cre-Flox lines and to cross 
them), reducing time and cost enormously (Fig. 4). 

The possibility of expressing the protein of interest in a KO background enables 
one to study the function of specific domains in the synaptic context. This approach 
can be instrumental to the understanding of synaptic proteins that are involved in 
neurological diseases. 


Conclusions and Future Perspectives 


The field of biological engineering has seen the rapid development of several novel 
technologies over the last few years, and neuroscience has embraced many of them 
to explore the function of synaptic proteins in a more precise and definitive way. 
Recent development of the CRISPR/Cas9 technology provides a simpler and faster 
alternative for studying synaptic proteins by removing the time and cost associated 
with the generation of genetically manipulated animals. Indeed, the approach can 
be used for the inactivation of target genes, but it also enables one to determine the 
significance of particular protein domains by performing rescue experiments, as 
discussed above. In addition, it is possible to place the expression of Cas9 under the 
control of a neuronal specific promoter for use in vivo, similar to what has been 
done with Cre previously (Gray et al. 2011; Lu et al. 2009; Schnell et al. 2002). 
Finally, another powerful feature of the CRISPR/Cas9 technology is the ability to 
easily inactivate several proteins at once using multiplex guide RNAs. 

How can one determine that the cleavage has indeed happened? The importance 
of this validation is best illustrated in a recent short report (Straub et al. 2014) in 
which the authors performed in utero electroporation in mice to inactivate GluN1. 
Recording from hippocampal slices of 2-week-old mice, they observed a total 
elimination of NMDA currents with one guide RNA whereas the other guide RNA 
tested had no effect at all. This finding underlines the importance of guide RNA 
design and the necessity to validate these guide RNAs. In many ways, these 
considerations are not specific to CRISPR/Cas9 and are also true for Cre-Lox and 
RNAi approaches. 

Fig. 3 (a) Scheme of the CRISPR plasmids modified to target specifically GRIN1 and GRIA2 and 
the time course of the transfection period before recording. (b) Sample trace and paired average 
NMDA-evoked EPSCs of single pairs from control and transfected cells. NMDA currents are 
completely eliminated after 10 days of transfection. (c) AMPAR-evoked EPSCs summary of 
CRISPR_GRIN1 and double CRISPR_GRIN1 & GRIA2. Bar graph indicates the rectification 
index mean values for the two conditions. The double CRISPR condition presents a fully rectified 
phenotype typical of GluA1 homomeric receptors 

The generation of a Cre-dependent Cas9 knock-in mouse might also become a 
very useful tool for neuroscientists (Platt et al. 2014). By injecting AAV driving the 
expression of Cre and of a guide RNA targeting NeuN in the brain of the Cas9 
mouse, the authors observed the formation of on-target indels in the infected region 
accompanied by an 80% reduction in NeuN protein levels. By enabling the inac- 
tivation of genes either in vivo or in isolated primary cells, this mouse model will 
surely serve as a versatile tool and could potentially be used as a platform for 
genome-wide screens. 

Finally, the combination of Cas9 with SunTag technology enables the user to 
activate the expression of specific genes (Tanenbaum et al. 2014). The system is 
based on the recruitment of multiple copies of gene regulatory effector domains to a 
nuclease-deficient CRISPR/Cas9 protein targeted to specific sequences in the 
genome. CRISPR can thus be used not only to delete synaptic proteins but also to 
turn on their endogenous expression. 

The most recent work on CRISPR/Cas9 underlines the importance of optimizing the 
system. In particular, modifications have been driven by the need to develop 
a Cas9 delivery system suitable for use in humans. The switch to SaCas9 
(from Staphylococcus aureus), which is much smaller, and the introduction of 
specific mutations to increase the specificity of Cas9 cleavage are examples 
of this race toward new drug development (Ran et al. 2015). Furthermore, the 
introduction of specific mutations in the SpCas9 sequence has significantly 
enhanced the specificity of this enzyme. This improvement has minimized 
the possibility of off-target effects, extending the applications of SpCas9 
for genome editing (Slaymaker et al. 2016). 

The use of CRISPR in neuroscience should be considered simply as a new tool, 
valuable in particular for the time and cost it saves in the genetic manipulation of synaptic 
genes. In labs all around the world, the introduction of CRISPR may not add 
anything really new regarding the final result, but it can substantially simplify the 
work (Fig. 4). 
Recurrently Breaking Genes in Neural 
Progenitors: Potential Roles of DNA Breaks 
in Neuronal Function, Degeneration 
and Cancer 

Abstract The repair of mammalian DNA double-strand breaks (DSBs) by classical 
non-homologous end joining (C-NHEJ) suppresses genomic instability and 
cancer and is required for development of the immune and nervous system. We 
hypothesize that proper repair of neural DSBs via C-NHEJ or other end-joining 
pathways is critical for neural functionality and homeostasis over time and that 
improper DSB repair could contribute to complex psychiatric and neurodegenera- 
tive diseases. Here, we summarize various findings made by our laboratory and 
others over the years that support this hypothesis. This evidence includes, most 
recently, our discovery of a set of genes, of which most serve neural functions, that 
can serve as targets of recurrent DSBs in primary neural stem and progenitor cells. 
We also present a speculative model, based on our findings, of mechanisms by 
which recurrent DSBs in neural genes can generate neuronal diversity and contrib- 
ute to neuropsychiatric disease. 


Early studies revealed that the lymphocyte-specific V(D)J recombination reaction 
involves the introduction of DNA double-stranded breaks (DSBs) at the ends of 
antigen receptor V, D, and J gene segments, followed by the processing of the 
generated ends and subsequent fusion of the DSB ends of the different types of gene 
segments to form V(D)J variable region exons (Alt and Baltimore 1982). The 
Baltimore lab discovered the lymphocyte-specific endonuclease (RAG) that gener- 
ates V(D)J DSBs (Schatz and Swanson 2011). Based on screens of DNA repair- 
mutant Chinese hamster ovary cell lines, we discovered that the end-joining phase 
of V(D)J recombination is carried out by a multi-component DSB end-joining 
pathway (Taccioli et al. 1993). We went on with collaborators to identify many 
of the various components of the “classical” non-homologous end-joining 
(C-NHEJ) pathway, including discovering the XRCC4 “core” C-NHEJ factor, 
based on our finding that this factor restores the ability of a DNA repair-defective 
Chinese hamster ovary cell line to undergo the joining phase of V(D)J recombina- 
tion (Li et al. 1995). 

To evaluate potential physiological functions of XRCC4 and other C-NHEJ 
factors newly discovered at the time, or other putative C-NHEJ factors, we 
inactivated the genes encoding them in mice (Sekiguchi et al. 1999; Ferguson and 
Alt 2001). Mice in which we inactivated the XRCC4 C-NHEJ factor, or its 
interaction partner DNA Ligase 4 (Lig4), had essentially identical phenotypes. 
These phenotypes included, most notably, abrogation of both lymphocyte and 
neuronal development due to unrepaired DSBs that occurred at the progenitor 
stage (Frank et al. 1998; Gao et al. 1998). It is striking that impaired development of 
lymphocytes and neurons was the most clear-cut defect in these C-NHEJ-deficient 
mice. As discussed below, XRCC4- or Lig4-deficient mice routinely die late in 
embryonic development, most likely due to their neuronal developmental defects. 
At this stage, effects on fetal lymphocyte development can still be assessed. 

Lymphocyte development is blocked at the progenitor stages in these core C-NHEJ- 
deficient backgrounds due to the inability to join V(D)J recombination-associated 
DSBs generated by the RAG endonuclease in the absence of core C-NHEJ factors 
(Alt et al. 2013). Thus, progenitor B and T lymphocyte development was completely 
abrogated due to the inability to, respectively, assemble functional antibody and T cell 
receptor genes that are needed for further development of the B and T cell lineages. As 
V(D)J recombination occurs at the G1 cell cycle stage, core C-NHEJ-deficient pro- 
genitor lymphocytes correspondingly undergo apoptosis due to a response to their 
unrepaired V(D)J DSBs that is mediated by the p53 G1 check-point response factor 
(Frank et al. 2000; Gao et al. 2000; Zhu et al. 2002). In this regard, p53 deficiency, in 
fact, rescues the embryonic lethality of XRCC4- or Lig4-deficient mice but does not 
rescue lymphocyte development because V(D)J joining is still abrogated. The allevi- 
ation of the p53 response to unrepaired RAG-generated DSBs at antigen receptor genes 
allows XRCC4- or Lig4-deficient progenitor lymphocytes to survive and enter the cell 
cycle, resulting in XRCC4/p53-deficient mice that rapidly develop lethal pro-B cell 
lymphomas (Frank et al. 2000; Gao et al. 2000). These C-NHEJ/p53-deficient pro-B 
lymphomas all harbor recurrent translocations that fuse RAG-initiated DSBs at the IgH 
locus to DSBs downstream of c-Myc (Zhu et al. 2002), with many likely initiated at 
cryptic RAG off-target sites in the c-Myc downstream region (Hu et al. 2014; 
Tepsuporn et al. 2014). Notably, however, even though core C-NHEJ-deficient/p53-deficient 
mice die from recurrent pro-B lymphomas, many of them harbor 
medulloblastomas in situ at the time of their death from pro-B lymphoma (Zhu et al. 
2002). Finally, conditional inactivation of Xrcc4 in p53-deficient B cells leads to 
mature B lymphomas with recurrent translocations involving DSBs initiated by the B 
cell-specific activation-induced cytidine deaminase (AID) during IgH class switch 
recombination (CSR, see below) that are joined to upstream regions of the c-Myc 
gene (Wang et al. 2009). 

Our studies demonstrated that XRCC4- or Lig4-deficient neuronal progenitor 
cells undergo apoptosis throughout the nervous system at a developmental time 
when particular neuronal progenitor populations differentiate into postmitotic neu- 
rons (Gao et al. 1998). Moreover, we implicated p53 checkpoint-initiated apoptosis 
in response to unrepaired DSBs that occurred in the neuronal progenitors as a 
mechanism for this death of newly differentiated neurons, as demonstrated by our 
finding that such neuronal apoptotic death could be rescued by p53 deficiency. In 
this regard, the postnatal survival of XRCC4-deficient or Lig4-deficient mice 
conferred by p53 deficiency has been speculated to be due to rescue of newly 
differentiated neurons with unrepaired DSBs (Sekiguchi et al. 1999). However, the 
potential effects of such unrepaired DSBs on neuronal functions in these mice could 
not be assessed due to their rapid death from pro-B cell lymphomas; thus, the 
potential roles of these implied DSBs in neuronal development and neuronal 
functions remained speculative. In this regard, a lingering question was the location 
of the genomic sites of the involved DSBs. 

As mentioned above, C-NHEJ/p53 double-deficient mice all develop progenitor 
B cell lymphomas with recurrent translocations between the IgH and c-Myc genes, 
whereas p53-deficient mice with Xrcc4 conditionally inactivated in B-lineage cells 
develop mature B-lineage tumors with translocations between IgH and c-Myc but 
also translocations of other antigen receptor loci (Wang et al. 2008, 2009). Thus, we 
attempted to identify recurrently breaking genomic sites in neural progenitor cells 
by conditionally inactivating Xrcc4 in neuronal stem and progenitor cells in a 
p53-deficient background. Strikingly, we found that such conditional inactivation 
of Xrcc4 in p53-deficient neural progenitors routinely led to medulloblastomas 
(MBs) with recurrent translocations on several different chromosomes and frequent 
chromosomal or extrachromosomal amplification of the N-myc gene (Yan et al. 
2006). These N-myc amplifications were reminiscent of those we found in human 
neuroblastomas in the process of discovering N-myc (Kohl et al. 1983). While the 
findings supported our original hypothesis that recurrent DSBs in the vicinity of 
N-myc (or other frequently translocated regions in MBs) could predispose to such 
translocations and amplifications, the resolution available from our studies at that 
time did not allow mapping of potential fragile break sites. 

Together, our prior studies revealed that DSB repair by C-NHEJ in neural stem 
and progenitor cells (NSPCs) is required for nervous system development and for 
suppressing childhood brain tumors (Gao et al. 1998; Yan et al. 2006). These 
studies also raised the interesting possibility of potential parallels between func- 
tional outcomes of DSB generation and repair in lymphocytes and neuronal pro- 
genitor cells. More recently, studies by others have shown that mature brain cells 
contain frequent genomic alterations that have been speculated to contribute to 
neuronal diversity and disease (McConnell et al. 2013; Poduri et al. 2013; 
Weissman and Gage 2016). In this regard, beyond inherited germline mutations, 
somatic, “brain only”, mutations have been implicated in neurodevelopmental and 
neuropsychiatric disorders (Poduri et al. 2013). However, the potential causes of 
genomic alterations in brain cells continued to remain largely unexplored and 
speculative. Based on our observations regarding the effects of C-NHEJ deficiency 
on neuronal development and neuronal disease, namely cancer, we sought to 
develop and employ new technologies to test the hypothesis that genomic alter- 
ations in mature brain cells and some variations connected to neuropsychiatric 
diseases might originate from DSBs in NSPCs. 

Over the past decade, since our discoveries of the potential roles for DSBs in 
neuronal diversity and disease, we have developed and enhanced a high- 
throughput, genome-wide translocation sequencing (HTGTS) approach to rapidly 
and highly sensitively identify DSBs genome-wide based on their translocation to 
bait DSBs (Chiarle et al. 2011; Frock et al. 2015; Hu et al. 2016). For this approach, 
bait DSBs can be introduced ectopically by designer endonucleases (Chiarle et al. 
2011; Hu et al. 2014; Meng et al. 2014; Frock et al. 2015) or recurrent endogenous 
DSBs can be used as bait, including those initiated by AID during IgH CSR (Dong 
et al. 2015) or by RAG during V(D)J recombination (Zhang et al. 2012; Hu et al. 
2015; Zhao et al. 2016). 

Our studies have shown that various classes of DSBs, including those induced 
ectopically by ionizing radiation, show a much greater preference to join to other 
DSBs within the same topological domain due to proximity effects associated with 
the spatial genome organization of chromatin domains (Zarrin et al. 2007; Zhang 
et al. 2012; Alt et al. 2013; Frock et al. 2015). Because two random DSBs rarely
occur within the relatively short genomic distance spanned by a chromosomal
domain, which is often a megabase or less, this phenomenon most strongly affects
the joining of closely linked recurrent DSBs (Alt et al. 2013). Our HTGTS studies
provided additional insights into our prior finding (Zarrin et al. 2007; Gostissa
et al. 2014) that CSR joining exploits the predisposition of high-frequency DSBs
within topological domains to be joined to each other to achieve physiological
joining levels (Zarrin et al. 2007; Dong et al. 2015). We also showed that, during
V(D)J recombination, RAG exploits chromosomal loop domains to not only achieve 
high joining frequency but also to developmentally restrict its activity directionally 
within a loop domain (Hu et al. 2015; Zhao et al. 2016). 

To identify the sources and functions of neural DSBs, we applied our HTGTS 
DSB identification approach to cultured, primary mouse NSPCs. For these HTGTS 
studies, we employed ectopically generated bait DSBs on several different chromo- 
somes to search for significant, recurrent clusters of DSBs genome-wide that joined 
to bait DSBs on more than one chromosome. These studies identified 27 recurrent 
DSB clusters (“RDCs”) in the NSPC genome, all of which were enhanced by mild 
replication stress via treatment with aphidicolin, a compound that inhibits
replication (Wei et al. 2016). Strikingly, all 27 of these RDCs lie within genes,
most of which encode surface proteins involved in synaptogenesis and related
neural processes (Wei et al. 2016). Moreover, variations of most RDC genes also
have been implicated in neuropsychiatric disorders, including schizophrenia and
autism, and many are rearranged in cancers, including brain cancers such as
medulloblastoma (Wei et al. 2016; Weissman and Gage 2016). Notably, human
counterparts of 9 of the 27 NSPC RDC genes occurred in copy number variations
(CNVs) found in individual human frontal cortex neurons (McConnell et al. 2013),
suggesting that NSPC RDC DSBs could contribute to genomic variations in mature
neurons (Wei et al. 2016; Weissman and Gage 2016).
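
The recurrence criterion described above can be illustrated with a small sketch.
It bins translocation junctions into fixed-size genomic windows and reports
windows that collect junctions from baits on more than one chromosome. The input
format, bin size, and thresholds are hypothetical choices made for illustration;
this is not the statistical pipeline used in Wei et al. (2016).

```python
from collections import defaultdict


def candidate_rdc_bins(junctions, bin_size=100_000, min_junctions=3, min_bait_chroms=2):
    """Return (chrom, start, end) bins that look like recurrent DSB clusters.

    junctions: iterable of (bait_chrom, prey_chrom, prey_pos) tuples, where the
    prey position is the genomic site joined to the bait DSB (hypothetical format).
    """
    counts = defaultdict(int)   # junctions per (prey chromosome, bin index)
    baits = defaultdict(set)    # bait chromosomes that reached each bin
    for bait_chrom, prey_chrom, prey_pos in junctions:
        key = (prey_chrom, prey_pos // bin_size)
        counts[key] += 1
        baits[key].add(bait_chrom)
    hits = [
        (chrom, idx * bin_size, (idx + 1) * bin_size)
        for (chrom, idx), n in counts.items()
        if n >= min_junctions and len(baits[(chrom, idx)]) >= min_bait_chroms
    ]
    return sorted(hits)


# Toy data: the chr8 window is reached from baits on two chromosomes,
# the chr3 window from a single bait chromosome only.
toy_junctions = [
    ("chr12", "chr8", 1_050_000),
    ("chr15", "chr8", 1_020_000),
    ("chr12", "chr8", 1_090_000),
    ("chr12", "chr3", 500_000),
    ("chr12", "chr3", 510_000),
    ("chr12", "chr3", 520_000),
]
print(candidate_rdc_bins(toy_junctions))
```

Requiring junctions from baits on more than one chromosome is what separates a
recurrent cluster from a site that merely lies close to a single bait.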

RDC gene transcriptional and replication characteristics suggest that their
frequent DSBs could occur during collisions between RNA and DNA polymerases
associated with mild replication stress (Wei et al. 2016). RDC gene DSBs appear to
occur very frequently across the body of RDC genes, which generally are very long
(up to 2 Mb in length) with relatively small exons and which often may lie within
topological domains (Wei et al. 2016). As HTGTS maps only those DSBs that
translocate to bait DSBs, local RDC DSB frequency may be much higher than the
minimal frequency of 12 RDC translocations per NSPC that we estimated from
translocation junction capture by HTGTS (Wei et al. 2016). Indeed, we have
estimated that the frequency of DSBs across long RDC genes, while of lower
density than CSR DSBs, approaches the same order of magnitude in numbers per
gene as CSR DSBs in B lymphocytes during IgH CSR (Wei et al. 2016). Notably,
because most of the RDC gene sequences are within introns, most of the RDC DSBs
also occur within introns as opposed to within exons (Wei et al. 2016).

By analogy to mechanisms of lymphocyte-specific recombination (Dong et al. 
2015; Hu et al. 2015), we propose that many DSBs that occur within RDC genes 
would be joined to other DSBs within the same RDC gene (Wei et al. 2016). Thus, 
we further propose that frequent RDC gene DSBs, which again mostly occur within 
introns, may be joined to shuffle exons and, thereby, contribute to neural cell 
diversity (Fig. 1). Such breakage and joining events may also have the potential 
of contributing to disease-associated neural gene alterations (Wei et al. 2016; 
Weissman and Gage 2016). 
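
The proposed deletional exon shuffling can be made concrete with a minimal
sketch: given a gene model and two intronic DSBs that are joined to each other,
it lists the exons that remain. The gene model, coordinates, and function name
are hypothetical, and the sketch deliberately handles only breaks that fall in
introns.

```python
def exons_after_deletion(exons, break_a, break_b):
    """Return names of exons retained after joining two intronic DSBs.

    exons: list of (name, start, end) with start inclusive and end exclusive.
    The segment between the two breaks is deleted; exons fully inside it are lost.
    """
    lo, hi = sorted((break_a, break_b))
    for name, start, end in exons:
        if any(start < pos < end for pos in (lo, hi)):
            raise ValueError(f"break falls inside exon {name}")
    return [name for name, start, end in exons if not (lo <= start and end <= hi)]


# Hypothetical long gene: small exons separated by very large introns.
toy_gene = [
    ("E1", 0, 200),
    ("E2", 50_000, 50_150),
    ("E3", 400_000, 400_120),
    ("E4", 900_000, 900_300),
]
print(exons_after_deletion(toy_gene, 30_000, 600_000))
```

In this toy gene, joining breaks in the first and third introns removes E2 and
E3, whereas two breaks within the same intron leave every exon in place.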

A number of RDC genes, for example, the neurexins (Treutlein et al. 2014), are 
thought to produce numerous isoforms via differential RNA processing. Beyond 
such a diversification mechanism, we propose that RDC-based recombination, by 
generating exon deletions, might “hard-wire” expression of variant RDC products 
in NSPCs and, thereby, contribute to neural diversity. Our current findings suggest 
that such putative activities would occur in NSPCs and the products of recombina- 
tion events would be carried on into mature neurons; in this regard, the process 
would be somewhat analogous to V(D)J recombination. However, the actual exon 
shuffling mechanism we propose would be more similar to IgH CSR, creating 
different isoforms of the protein rather than creating new exons (Fig. 1). In this
scenario, long neural genes that consist largely of intronic sequences into which
small exons are embedded (Smith et al. 2006) could have evolved to provide large
target introns for more random stress-associated DSBs in NSPC development. This
would be a different solution to the problem of targeted exon shuffling than that
employed by CSR, in which DSBs are introduced into specialized intronic switch
region sequences (Fig. 1). Whether or not the processes that generate RDC genes
are specialized to the neural lineage will require further investigation, as will
the question of whether enhanced replication stress at the stem and progenitor
stages of neural development could, via an RDC-based mechanism, contribute to
neural disease.

Fig. 1 Top panel Diagram of the IgH class switch recombination reaction as
illustrated by switching from IgM to IgG1. The IgH locus is contained within a
topological domain (TAD). In activated B cells, switching from IgM to IgG1 results
from an exon shuffling process in which the V(D)J exon is first expressed with Cμ
to generate IgM but, upon activation, DSBs initiated by AID in repetitive switch
(S) regions upstream of Cμ and Cγ1 are joined by C-NHEJ to delete Cμ and replace
it with Cγ1. This recombination/deletion exon shuffling process allows the same
V(D)J exon to be expressed with a different C exon (for other details, see text or
Alt et al. 2013). Bottom panel Diagram of a hypothetical RDC DSB-based exon
shuffling mechanism that allows different isoforms of RDC genes to be expressed
by “hardwiring” potential somatic splice variants by deletional recombination.
This model is based on the finding that at least some RDC genes lie within TADs
and that RDC DSB frequency upon replication stress may approach that of IgH S
regions, allowing ends of different RDC DSBs within the same gene to be frequently
joined, based on their proximity within the same topological domain. This model
could offer one explanation for why many neural genes are very large and embedded
with relatively small exons (Smith et al. 2006): namely, as these genes consist
mostly of intronic sequences, most “randomly” introduced RDC DSBs across them fall
within intronic sequences rather than in exons, providing a basis for a
replication stress-associated DSB diversification mechanism. If so, whether or not
requisite replication stress is somehow programmed during NSPC development
remains to be addressed (see text or Wei et al. 2016 for other details)

RDCs also potentially provide a mechanistic basis for many common fragile sites 
and certain CNVs, which may result from transcription/replication collisions in 
generating DSBs or other lesions (Glover and Wilson 2016; Wei et al. 2016). Two 
NSPC-RDC genes, CDH13 and NRXN3, are within recurrent CNVs in human MBs 
(Northcott et al. 2012; Rausch et al. 2012) and several candidate RDCs lie proximal to 
mouse N-myc (Wei, Schwer and Alt, unpublished data). It is possible that RDCs 
contribute to recurrent genomic variations we and others have found in MBs (Yan 
et al. 2006), which may offer a mechanism to support the speculation from long ago 
that proximal, recurrent DSBs during neuroblast differentiation contribute to N-myc 
amplification in human neuroblastomas (Kohl et al. 1983). A number of the 27 identi- 
fied NSPC RDC-genes undergo somatic genomic rearrangements, including deletions, 
amplifications, and translocations in various types of cancer (see Wei et al. 2016), and 
some undergo CNVs in embryonic stem cells and fibroblasts (Wilson et al. 2015; 
Glover and Wilson 2016). Our HTGTS analysis of additional cell types could identify 
potential spontaneous or replication stress-induced RDCs in other cell types and, more 
generally, could shed light on the mechanisms underlying the genetic variations in a 
range of cancers. 
Abstract The common marmoset (Callithrix jacchus) is a small New World 
non-human primate indigenous to northeastern Brazil. This species has been attracting 
the attention of biomedical researchers and neuroscientists for its ease of handling and 
colony maintenance, unique behavioral characteristics, and several human-like traits, 
such as enriched social vocal communication and strong relationships between parents 
and offspring. Its high reproductive efficiency makes it particularly amenable for use in 
the development of transgenic and genome editing technologies in a non-human 
primate model. Our group has recently generated transgenic marmosets with germ 
line transmission, opening new avenues in primate research. 

In this chapter, we describe recent advances in neuroscience and disease research 
using common marmosets, and we outline potential uses of genome editing in 
non-human primates toward the development of knock-in/knock-out marmosets. 


Introduction 


Rodent models have long played important roles in neuroscience and medical 
research, made possible in part by the advent of robust genetic technologies. 
Knock-out/knock-in mouse models have shown particular utility in the neurosci- 
ences. There are nonetheless substantial anatomical, physiological, and cognitive 
differences between rodents and humans. The human brain consists of two major 
functional domains, one that is evolutionarily conserved and a second that is 
primate-specific and the locus of many higher cognitive functions. For many 
human neurological and psychiatric diseases involving higher cognitive dysfunc- 
tions, studies using rodent models may thus not be informative with respect to the 
relevant pathophysiological mechanisms. To gain a better understanding of the 
pathogenesis of such diseases, we need animal models that exhibit brain functions
more similar to those in humans.

This need has led to increased interest in the development of genetically 
engineered non-human primates for use in the study of both functional domains. 
Our group has recently generated a transgenic common marmoset, a New World 
monkey (Sasaki et al. 2009). Emerging genome editing techniques are also opening 
new possibilities for the creation of better non-human primate models for use in the 
study of neurodegenerative and mental disorders (Izpisua Belmonte et al. 2015). 

This chapter is an updated and modified version of previously published review 
articles on marmosets (Okano et al. 2016; Kishi et al. 2014) and work presented by 
Hideyuki Okano at the “Genome Editing in Neurosciences” symposium. 


Characteristics of the Common Marmoset 


Common marmosets (Callithrix jacchus) are New World primates native to the 
Atlantic coastal forests of northeastern Brazil (Abbott et al. 2003; Carrion and 
Patterson 2012; Mansfield 2003; Okano et al. 2016; Tokuno et al. 2012; Kishi et al. 
2014; Izpisua Belmonte et al. 2015). These small monkeys (adult height: 20-30 cm; 
weight: 350-400 g) have ear tufts and relatively long banded tails, and they are 
omnivorous, eating plant exudates, lizards, and infant mammals. Common marmo- 
sets are monogamous and, unlike many other non-human primates, live in stable 
families of approximately ten members (Tardif et al. 2003). Females commonly 
give birth to two babies per litter and are ready to breed again about 10 days after 
giving birth; they typically have two litters per year. Since mothers need to nurse 
infants during gestation and the perinatal period, the male partner and other 
members of the group also provide infant care. This remarkably human-like trait 
is a focus of attention among neuroscientists and behavioral scientists. 

Although common marmosets have been used for biomedical research since the 
1960s, macaque monkeys are more widely used in research, due to their closer 
similarity to humans. The recent rapid advances in genome editing are now calling 
new attention to the advantages offered by the marmoset because of its size, 
availability, and high reproductivity. 

Macaques are evolutionarily closer to humans than common marmosets, but 
some marmoset traits are more similar to those of humans, perhaps due either to 
geographical segregation or convergent evolution. New World primates are esti- 
mated to have diverged from Old World primates ~35 mya, and these monkeys 
have adapted to neotropical environments. Despite this phylogenetic distance, 
common marmosets, like humans, exhibit strong intergenerational kin relationships 
and social vocal communications (Dell’Mour et al. 2009; Eliades and Wang 2008; 
Gordon and Rogers 2010), which may indicate a convergent trajectory in their 
evolution. The genomic basis of the origins of such traits may be addressable 
through genome editing studies in the future. 
Advantages of Using Common Marmosets for Biomedical 
Research 


Rodents play a crucial role in biomedical investigations in many research fields. 
Powerful genetic tools, such as knock-out/knock-in mice, have informed the study 
of gene functions, but the significant anatomical and physiological differences 
between rodents and humans mean that a more closely similar animal model is 
needed to advance our understanding of human biology in areas such as the 
neurosciences. 

For biomedical use, the common marmoset offers many advantages. Marmoset 
endocrinology and metabolism are more similar to those of humans than of rodents, 
which is important in pharmacological and toxicological studies of new drug 
candidates. The marmoset is also more closely phylogenetically related to humans 
(Kitamura et al. 2011; t’Hart et al. 2003, 2012). In Europe, the marmoset is now 
being used as a non-rodent second species in drug safety tests (Smith et al. 2001). 

The common marmoset can be handled with greater ease than many other 
non-human primates. Along with the appropriateness of the model to the research 
question, animal welfare and availability are important factors in selecting a model 
species. Marmosets are readily obtained for laboratory use and, as distinct from 
macaques, have not been reported to carry herpes B virus (Macacine herpesvirus 1), 
providing a safety benefit to researchers and animal facility staff (Mansfield 2003). 
The small size of marmosets is also beneficial as it reduces costs and floor space 
requirements (Smith et al. 2001). 

Common marmosets are among the most highly reproductive of all primates. 
The ovarian cycle is approximately 28 days, similar to that in humans (Summers 
et al. 1985). The gestation period is approximately 145-148 days. Female animals 
are ready to breed again 10 days after delivery. Usually, female marmosets have 
two litters per year, which is strongly advantageous when compared to macaques, 
which require 5 years to sexual maturation and breed only once per year (Austad 
and Fischer 2011). The remarkable reproductive efficiency of marmosets is 
extremely well-suited to the development of transgenic and genome editing 
techniques. 

Lastly, a number of basic research tools have been developed for use in marmo- 
sets, which is important for encouraging broader adoption by the scientific commu- 
nity. Although the annotated sequencing of its genome has not been completed, a 
draft sequence with 6x coverage using whole-genome shotgun sequencing is avail- 
able on GenBank (The Marmoset Genome Sequencing and Analysis Consortium 
2014; URL: https://www.hgsc.bcm.edu/content/marmoset-genome-project). 

Our group has also sequenced the marmoset genome using animals from the 
colony maintained by the Central Institute for Experimental Animals (CIEA) in 
Kawasaki, Japan (Sato et al. 2015). Resequencing and assembly of the genome
were performed by deep sequencing on a next-generation sequencer, giving
approximately 60x coverage. This enabled us to generate genome assemblies and
analyze gene-coding sequences more efficiently and provided a basis for genome
editing.
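
To put the 6x draft and the approximately 60x resequencing in perspective, mean
coverage can be approximated as the total number of sequenced bases divided by
the genome size. The sketch below applies that relation; the genome size and read
length are round values assumed purely for illustration, not figures from the
marmoset sequencing projects.

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Approximate mean depth: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def reads_needed(coverage, read_length, genome_size):
    """Number of reads required to reach a target mean coverage."""
    return math.ceil(coverage * genome_size / read_length)


ASSUMED_GENOME_SIZE = 3_000_000_000  # assumed round figure, in base pairs
ASSUMED_READ_LENGTH = 100            # assumed read length, in bases

reads_6x = reads_needed(6, ASSUMED_READ_LENGTH, ASSUMED_GENOME_SIZE)
reads_60x = reads_needed(60, ASSUMED_READ_LENGTH, ASSUMED_GENOME_SIZE)
print(reads_6x, reads_60x, reads_60x / reads_6x)
```

Whatever values are assumed for genome size and read length, a tenfold increase
in coverage requires tenfold more sequenced bases.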

We have also applied non-invasive imaging methods in marmoset research. The use 
of marmosets in such studies is limited to small numbers due to cost and ethical issues. 
Magnetic resonance imaging (MRI) is a non-invasive imaging technique to visualize 
various organs in detail. We have adapted a number of MRI techniques, including 
diffusion tensor tractography (DTT; Fujiyoshi et al. 2007; Hikishima et al. 2015) and 
voxel based morphometric (VBM) analysis (Hikishima et al. 2011, 2015), and a new 
method for the visualization of myelin (Myelin Map; Fujiyoshi et al. 2016). 


Transgenic Techniques and Genome Editing Technology 
for Marmoset Research 


One of the strengths of the mouse model is the availability of powerful genetic 
tools, such as transgenic and knock-in/knock-out animals, that have given the 
mouse a central place in life sciences research over the past two decades. However, 
results from mouse genetics are not always directly relevant to humans. Particularly 
in the neurosciences, there are considerable interspecies differences in brain anat- 
omy and physiology, behavioral control mechanisms, and life span, and some 
mouse disease models do not recapitulate human symptoms. For example, neuro- 
fibrillary tangles, the neuropathological hallmarks of Alzheimer’s disease, cannot 
be recapitulated in mice showing amyloid plaques (Chin 2011; Games et al. 1995; 
Hsiao et al. 1996; Sturchler-Pierrat et al. 1997; Tanzi and Bertram 2005; Walsh and 
Selkoe 2004). It is also known that mice in which parkin, the gene associated with 
familial Parkinson’s disease in humans, has been knocked out do not show 
parkinsonism. 

Despite the scientific demand for research in non-human primates, efforts to 
generate transgenic non-human primate animals have been unsuccessful until 
recently. In 2008, Yang et al. (2008) reported a transgenic rhesus macaque
expressing the human huntingtin (HTT) gene with a CAG expansion encoding
the polyglutamine tract as a model of Huntington’s disease. However, despite the
genomic insertion of the human HTT transgene in the founder monkeys, germline
transmission of the transgene has not been confirmed. Our group independently
generated transgenic common marmosets expressing the enhanced GFP (EGFP) 
gene and we reported the first germline transmission in a non-human primate 
(Sasaki et al. 2009). 

While the establishment of transgenic marmosets enables the generation of
marmoset models of diseases caused by overexpression of a relatively small mutant
gene, such as Parkinson’s disease, Alzheimer’s disease, and amyotrophic lateral
sclerosis (ALS), transgenic techniques limited our ability to genetically modify
non-human primates. Transgenic technologies available at the time could insert
only <8 kb of exogenous genes into the genome, and only at random sites (Sasaki
et al. 2009). Moreover,
transgenes were segregated and suppressed across generations, expression levels 
could not be controlled, and the techniques were only suited to gain-of-function, 
not loss-of-function, studies. Most human genetic diseases are caused by either point 
mutations or deletions of endogenous genes, which highlighted the need for new gene 
modification technologies for use against endogenous genes. 

Remarkable recent advances in genome editing technology have now made it 
possible to overcome these previous limitations (Sato et al. 2016). Genome editing 
tools, i.e., engineered nucleases, bind to a target genome sequence and introduce 
specific double-strand breaks. Double-strand breaks initiate cell-endogenous repair 
mechanisms such as homology-directed repair (HDR), non-homologous 
end-joining (NHEJ), and microhomology-mediated end-joining (MMEJ). Mutations
in endogenous genes can be introduced by taking advantage of such
mechanisms. Zinc finger nucleases (ZFNs), transcription activator-like effector
nucleases (TALENs), and the clustered regularly interspaced short palindromic
repeat (CRISPR)/Cas system are the main engineered nucleases in use. A number
of genetically modified animals have already been generated using such engineered
nucleases (Bedell et al. 2012; Geurts et al. 2009; Hauschild et al. 2011; Mashimo
et al. 2010; Ochiai et al. 2010; Sung et al. 2013; Suzuki et al. 2013; Wang et al. 
2013; Yang et al. 2013). Among these, the CRISPR/Cas system was developed the 
most recently and is particularly promising (Cong et al. 2013; Mali et al. 2013). 

Using these genome editing technologies, we recently generated X-linked SCID
model marmosets by knock-out of the interleukin-2 receptor subunit gamma gene
(Sato et al. 2016). We are now seeking to generate marmoset models of autism
spectrum disorders, including Rett syndrome (Chahrour and Zoghbi 2007; Kishi
and Macklis 2005) and tuberous sclerosis complex (Ess 2010; Fig. 1). Although a
mouse model (male hemizygous Mecp2 mutation) is available for Rett syndrome, it
does not necessarily mimic the critical symptoms. For example, while male
hemizygous mice (Mecp2-/y) are used as model mice, human Rett patients are
exclusively heterozygous females. It is likely that MECP2 mutations are embryonic
lethal in human males, but not in mice. Furthermore, phenotypes appear at
adult stages in mouse models, whereas symptoms become evident by 1 year of age
in human Rett syndrome patients. New primate models that more closely mimic the
clinical course of human disease may thus contribute to a better understanding of
the pathogenesis and future treatments for neurodevelopmental disorders.

Fig. 1 Generation of a knock-out marmoset by genome editing with ZFN or TALEN


Future Perspectives 


Genome editing has developed rapidly in recent years, leading to the production of 
genetically modified animals in many species. This technology has also been 
applied to non-human primates, and some groups have begun to report genetically 
modified macaques (Niu et al. 2014; Liu et al. 2014). Macaques offer a number of 
advantages, but it is difficult to expand colony size within a reasonable research 
period. We suggest that the common marmoset is thus a highly suitable alternative 
model primate for many areas of study, and the creation of knock-in/knock-out 
marmosets would help to introduce the benefits of this model to a larger community 
of researchers. Since germ-line-competent marmoset embryonic stem cells are not 
currently available, it is necessary to perform genome editing in one-cell stage 
embryos (fertilized eggs) to obtain knock-in/knock-out marmosets efficiently. The 
emergence of more sophisticated genome editing techniques will facilitate and 
accelerate the development of new gene manipulation technologies in marmosets. 
Marmoset models of disease generated using genome editing may contribute to the 
development of new therapeutic strategies for currently incurable neurodegenera- 
tive diseases and mental disorders. 
Multiscale Genome Engineering: 
Genome-Wide Screens and Targeted 
Approaches 


Neville E. Sanjana 


Abstract Recent progress in genome engineering technologies, such as efficient
programmable CRISPR nucleases, has enabled new advances in forward and
reverse genetic studies. Here, I discuss recent work from our group combining 
top-down approaches like genome-wide loss-of-function screens and bottom-up 
approaches like disease variant modeling in human stem cells and stem cell-derived 
cortical neurons. 


Introduction 


Patient-sequencing studies have yielded large lists of disease-associated gene 
variants but it has been difficult to establish a causal role based solely on 
genetics. New methods are needed for rapidly understanding the effects of 
these variants and ascertaining whether the variants directly influence disease- 
related phenotypes. At the IPSEN meeting “Genome Editing in Neurosciences,” I 
presented two approaches (top-down and bottom-up) for harnessing new genome 
engineering techniques to decipher the roles of genetic variants in human health 
and disease. 

Top-down approaches utilize large-scale pooled libraries of genome- 
engineering reagents to start with a large, minimally biased hypothesis space and 
identify relevant variants via a single phenotypic selection (Fig. 1, right). In 
contrast, bottom-up approaches start with a handful of genetic variants nominated 
by strong genetic data, e.g., genome-wide association studies, case-control, family 
linkage studies, etc., and they examine a wide range of phenotypes (Fig. 1, left). 
Fig. 1 Schematic diagram of the dynamic interplay between top-down and bottom-up genetic 
approaches. Top-down approaches (right) identify new gene candidates that can shape/reduce the 
space of hypotheses for more detailed bottom-up (left) cellular models and phenotyping. 
Top-down approaches are unbiased or minimally biased to cast a wide net of possible genetic 
hypotheses and use phenotypic selection to identify putative disease-associated gene variants. 
Bottom-up approaches focus on a smaller set of variants but usually provide a more detailed 
phenotypic analysis of different molecular/cellular/circuit aspects of each genetic variant. Candi- 
date variants for the bottom-up approach can be derived from either genetic evidence (e.g., patient 
sequencing studies) or from top-down approaches like genome-wide CRISPR screens 


Top-Down Approaches Using Genome-Wide CRISPR 
Screens 


The microbial CRISPR-Cas9 nuclease from S. pyogenes can be guided to specific DNA 
sequences using a 20 bp guide sequence. Given the short length of the guide sequence, 
we have been able to use oligonucleotide array synthesis techniques to create libraries 
of thousands of guide sequences in a pooled format. By designing pooled libraries to 
target all genes in a specific genome, we have created a new tool for functional genomic 
screens (Sanjana 2016). Genome-scale CRISPR knock-out (GeCKO) screens use the 
consistent phenotypic enrichment of multiple CRISPR reagents targeting the same gene 
to lend evidence to the gene’s role in a particular disease. Using a GeCKO library 
targeting ~18,000 genes with 64,751 guide sequences, we have found loss-of-function 
mutations that confer resistance to the BRAF inhibitor vemurafenib in human mela- 
noma cells (Shalem et al. 2014). Using a second-generation GeCKO library in a mouse 
model, we performed an in vivo screen to identify driver mutations that trigger 
metastasis to the lung (Sanjana et al. 2014; Chen et al. 2015). Recently, we have 
expanded the scope of CRISPR pooled screens to also include noncoding regions of the 
genome (Sanjana et al. 2016; Wright and Sanjana 2016). 
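
The library described above averages about 3.6 guides per gene (64,751 guides
for ~18,000 genes), which is what allows a gene to be supported by several
independent reagents. The following sketch illustrates that idea: it computes a
normalized log2 fold change for each guide and calls a gene a candidate only when
at least two of its guides are enriched. The thresholds, pseudocount, and data
layout are assumptions made for illustration; this is not the analysis used in
the published GeCKO screens.

```python
import math
from collections import defaultdict


def enriched_genes(guide_to_gene, before, after,
                   min_log2fc=1.0, min_guides=2, pseudocount=1.0):
    """Return genes supported by at least `min_guides` enriched guides.

    guide_to_gene: dict mapping guide id -> target gene.
    before / after: dicts mapping guide id -> read count pre/post selection.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    support = defaultdict(int)
    for guide_id, gene in guide_to_gene.items():
        # Normalize counts to library size so sequencing depth does not matter.
        freq_before = (before.get(guide_id, 0) + pseudocount) / total_before
        freq_after = (after.get(guide_id, 0) + pseudocount) / total_after
        if math.log2(freq_after / freq_before) >= min_log2fc:
            support[gene] += 1
    return sorted(gene for gene, n in support.items() if n >= min_guides)


# Toy screen: three guides per gene, equal representation before selection.
toy_guides = {f"g{i}": gene for i, gene in enumerate(
    ["GENE_A"] * 3 + ["GENE_B"] * 3 + ["GENE_C"] * 3, start=1)}
toy_before = {guide_id: 100 for guide_id in toy_guides}
toy_after = dict(toy_before, g1=900, g2=800, g4=700)
print(enriched_genes(toy_guides, toy_before, toy_after))
```

In the toy data, GENE_B has a single strongly enriched guide and is not called,
which mirrors the rationale for requiring consistent enrichment across multiple
guides targeting the same gene.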


Bottom-Up Approaches Using Exome Sequencing in Autism 


Whole exome and whole genome sequencing have ushered in a revolution in 
identifying rare, disease-associated variants. Several whole exome sequencing stud- 
ies have pinpointed rare de novo variants associated with autism spectrum disorder 
by examining exomes from autistic individuals and comparing them to parental 
exomes (Iossifov et al. 2012; Neale et al. 2012; O’Roak et al. 2012; Sanders et al.
2012). The next logical step is to create relevant cellular models to better
understand the mechanisms through which these variants work and to serve as a
platform for drug screens and therapeutic testing. There are two major roadblocks
to building these kinds of cellular models. The first concerns gene editing and,
over the past 3 years, has become largely historical: until recently, genome
engineering in human stem cells and neurons was challenging, but transfection of
CRISPR plasmids or ribonucleoproteins now provides an easy, efficient technique
for engineering human cells (Peters et al. 2008; Swiech et al. 2015). The second
major hurdle has been neural
differentiation. Common protocols to differentiate neurons from human stem cells, 
such as dual SMAD inhibition or embryoid body differentiation, require months to 
create mature neurons (Zhang et al. 2001; Chambers et al. 2009). Recently, we and 
others have demonstrated that viral overexpression of Neurogenin 1 or Neurogenin 
2 can rapidly drive stem cells into a homogeneous culture of mature cortical neurons 
(Zhang et al. 2013; Busskamp et al. 2014). These neurons display robust electro- 
physiological activity within just 2-3 weeks after the start of differentiation, making 
them ideally suited for synaptic assays, calcium imaging and neurophysiology. We 
are now moving forward with phenotypic analyses of de novo mutations in autism 
using the combined CRISPR-Neurogenin platform for rapid mutagenesis and human 
neuron profiling. 
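
The trio comparison that underlies de novo variant discovery can be sketched as
a set difference: a candidate de novo variant is one observed in the child but in
neither parent. The variant tuples below are invented for illustration, and real
pipelines also weigh read depth and genotype quality, which this sketch omits.

```python
def de_novo_candidates(child, mother, father):
    """Return variants present in the child and absent from both parents.

    Each argument is a set of (chrom, pos, ref, alt) tuples (hypothetical format).
    """
    return sorted(child - mother - father)


# Toy exomes: only the chr2 variant is absent from both parents.
toy_child = {("chr1", 12345, "A", "G"), ("chr2", 67890, "C", "T"), ("chr7", 555, "G", "A")}
toy_mother = {("chr1", 12345, "A", "G")}
toy_father = {("chr7", 555, "G", "A"), ("chr9", 42, "T", "C")}
print(de_novo_candidates(toy_child, toy_mother, toy_father))
```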

Taken together, these new technologies in genome engineering—enabled in 
large part by CRISPR nucleases and related transformative methods—have 
improved our ability to perform forward and reverse genetic assays in relevant 
model systems. A major challenge for neuroscience is finding clear phenotypes that 
accurately reflect complex diseases, such as schizophrenia or autism. Despite these 
challenges, a combination of top-down and bottom-up approaches will pave the 
way for a clearer understanding of the human brain in healthy and disease states. 
Using Genome Engineering to Understand 
Huntington’s Disease 


Barbara Bailus, Ningzhe Zhang, and Lisa M. Ellerby 


Abstract Huntington’s disease (HD) is a fatal, dominantly inherited neurodegen- 
erative disorder caused by a CAG trinucleotide expansion in the Huntingtin (HTT) 
gene, leading to an expanded polyglutamine (polyQ) region in the encoded protein 
HTT. We have used homologous recombination (HR) to genetically correct HD 
patient-derived induced pluripotent stem cells (IPSCs) and found that this reversed 
HD disease phenotypes. We have utilized exploited genome editing tools including 
TALENs (Transcription like activator effectors) and CRISPR (Clustered Regula- 
tory Interspaced Short Palindromic Repeats)/Cas9 technology to carry out genetic 
correction or expansion, and we were able to detect HR without selection in human 
cells. The overall goal is to use this technology to model HD-relevant cell types and 
better understand disease progression by leveraging systems biology approaches. To 
understand the disease progression, isogenic iPSC lines were created. We found 
that the disease phenotypes only manifested in the differentiated neural stem cell 
(NSC) stage, not in iPSCs. Transcriptomic analysis of HD iPSCs and HD NSCs 
compared to isogenic controls was utilized to understand the molecular basis for the 
CAG repeat expansion-dependent disease phenotypes in NSCs. Differential gene 
expression and pathway analysis identified transforming growth factor β (TGF-β) 
signaling, netrin-1 signaling and medium spiny neuron (MSN) maturation and 
maintenance as the top dysregulated pathways in HD NSCs. The ability to create 
additional isogenic cell lines through CRISPR-mediated HR will further enhance 
our understanding of HD progression. These lines can be manipulated with CRISPR 
to understand the effects of common SNPs (single nucleotide polymorphisms) that 
modulate disease onset in HD, allowing the identification of new pathways and 
helping to elucidate potential therapeutic targets for HD. Beyond drug discovery, 
the CRISPR system could eventually be optimized for use in vivo to correct a 
patient’s disease-causing mutation in the asymptomatic stages of HD. 
Huntington’s disease (HD) is a devastating, dominantly inherited movement and 
psychiatric disorder that is caused by expansion of a CAG trinucleotide repeat in the 
first exon of the Huntingtin gene (HTT), resulting in translation of an expanded 
polyQ repeat in the HTT protein. The production of the abnormal expanded polyQ- 
containing HTT protein leads to a dramatic loss of striatal and cortical neurons and 
pro-survival growth factors such as BDNF (brain derived neurotrophic factor) in 
HD patients. The polyQ expansion in the HTT protein leads to disrupted cellular 
homeostasis and activation of cellular death pathways (Fig. 1). Since the disease is 
inherited in an autosomal dominant fashion, each child of an affected parent has a 
50% chance of being affected. HD generally manifests in mid-life, with a mean age 
of onset of 35-45 years. The disease begins with cognitive disturbances and 
progresses to severe and debilitating motor symptoms (chorea) usually accompa- 
nied by psychiatric disturbances, with death following in about 15-20 years 
(Landles and Bates 2004). The current therapeutic approaches in HD focus on 
normalizing molecular pathways disturbed in HD or on lowering the levels of the 
mutant HTT protein (Canals et al. 2004; Conforti et al. 2008; Zuccato et al. 2008). 
To date none of these approaches are approved for use outside of clinical trials and 
they will not cure the disease. 
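
As a minimal illustration of the repeat expansion described above, the following Python sketch counts the longest uninterrupted CAG run in a DNA fragment and writes out the corresponding polyQ tract. The function names and the flanking sequences are hypothetical and chosen only for illustration; the repeat lengths of 21 and 45 are borrowed from the allelic series discussed later in this chapter, and the sketch makes no attempt to reproduce clinical repeat sizing.

```python
import re


def longest_cag_run(dna: str) -> int:
    """Return the longest uninterrupted CAG run, counted in repeat units."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def polyq_tract(n_repeats: int) -> str:
    """CAG encodes glutamine (Q), so n in-frame repeats give n Q residues."""
    return "Q" * n_repeats


# Hypothetical toy fragments: made-up flanks around a CAG tract.
short_allele = "ATGGCG" + "CAG" * 21 + "CCGCCA"
expanded_allele = "ATGGCG" + "CAG" * 45 + "CCGCCA"

for name, seq in (("short", short_allele), ("expanded", expanded_allele)):
    n = longest_cag_run(seq)
    print(f"{name}: {n} CAG repeats, polyQ tract of {len(polyq_tract(n))} residues")
```

Counting only uninterrupted runs is a deliberate simplification; an interrupted tract would be reported as its longest pure segment.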

In this chapter, we discuss the use of gene editing tools to model neurological 
diseases such as HD as well as the potential to use this technology to treat genetic 
neurological diseases. 


[Figure 1 schematic. Labels: CAGCAGCAG (expanded repeat); Functioning Protein; Aggregation and Toxic Fragments of Mutant Protein; Disrupted Homeostasis and Cellular Death] 


Fig. 1 Illustration of the neuronal changes occurring in the striatum of a Huntington’s disease 
patient. The exon 1 CAG expansion in the HTT allele results in a mutant protein being formed; the 
mutant protein aggregates and is also cleaved into toxic fragments. The aggregates and the toxic 
fragments result in a disrupted cellular homeostasis and eventual neuronal cellular death in the 
striatum 
Targeted gene editing has evolved dramatically in the last 25 years. While originally 
a technique that a handful of laboratories had mastered, it is now a common tool used 
in hundreds of laboratories around the world. One family of gene editing proteins is 
the customized zinc finger proteins (Segal and Barbas 2000; Wolfe et al. 2000; Pabo 
et al. 2001; Nagaoka and Sugiura 2000). These proteins were adapted for targeted use 
in the late 1990s (Liu et al. 1997; Segal et al. 1999; Dreier et al. 2001). Each zinc 
finger could be designed to recognize a specific three-base-pair sequence of DNA 
through interactions between the amino acids of the finger’s alpha helix and the 
DNA base pairs (Segal and Barbas 2001). To recognize a specific sequence of DNA, 
the zinc fingers could be attached to each other, with six zinc fingers recognizing a 
unique 18-base pair sequence in an organism’s genome. The zinc finger proteins 
could have effector or nuclease domains attached, allowing for gene regulation or 
gene replacement. The effector domains included VP64 for gene activation, KRAB 
for gene silencing and DNMT1 for methylation (Beerli et al. 1998; Rivenbark et al. 
2012). The nuclease domain could cut targeted genomic sites and allow for muta- 
genesis or homologous recombination at enhanced efficiency. Zinc finger proteins 
have been used successfully in human cells and animal organs and have reached Phase II 
human clinical trials (Geurts et al. 2009; Urnov et al. 2005; Sangamo Biosciences 
2001; Eisenstein 2012). Although promising, zinc fingers presented several 
challenges for researchers. Their targeting ability was limited, they required specialized 
design techniques and they exhibited a frequent incidence of off-target events (Cornu 
and Cathomen 2010; Gupta et al. 2010; Gabriel et al. 2011). Some advances have 
been made to reduce the off-target potential and increase detection of these events 
(Zykovich et al. 2009; Cornu et al. 2008). The therapeutic potential of zinc fingers for 
a variety of diseases, including HD, continues to be explored by the biotechnology 
company Sangamo (Cornu et al. 2008; Wolffe 2016). 
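
The statement above that six fingers can specify a unique 18-base pair site can be checked with a back-of-the-envelope estimate. The calculation below assumes, as a simplification not made in the text, that bases are independent and equally frequent, and it takes the haploid human genome to be roughly 3 × 10^9 bp searched on both strands. Under those assumptions, an 18-bp site is expected to occur by chance far less than once per genome, whereas a 9-bp site (three fingers) would occur tens of thousands of times.

```latex
\[
4^{18} \approx 6.9 \times 10^{10},
\qquad
\mathbb{E}[\text{chance matches, 18 bp}]
  \approx \frac{2 \times 3 \times 10^{9}}{4^{18}} \approx 0.09,
\]
\[
4^{9} = 262{,}144,
\qquad
\mathbb{E}[\text{chance matches, 9 bp}]
  \approx \frac{2 \times 3 \times 10^{9}}{4^{9}} \approx 2.3 \times 10^{4}.
\]
```

Real genomes are repetitive and compositionally biased, so these figures are only an order-of-magnitude guide.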

In 2009, a new class of gene editing proteins was described: transcription activator-like 
effectors (TALEs; Boch et al. 2009; Moscou and Bogdanove 2009). These proteins 
were originally characterized in Xanthomonas bacteria and represented a major 
advance for DNA regulating proteins. TALEs, unlike zinc fingers, made contact with 
individual DNA base pairs, which greatly expanded the sequences that could be 
targeted in the genome (Moscou and Bogdanove 2009). They were also much easier 
to design and assemble. Much like zinc fingers, TALEs could have effector or nuclease 
domains attached to the DNA binding domain, allowing for the DNA to be cut or for 
genes to be regulated (Christian et al. 2010; Maeder et al. 2013a, b; Cong et al. 2012). 
Promising experiments in a variety of organisms have validated the efficacy of TALEs, 
although no human clinical trials have begun. Recent publications have shown the 
ability of TALEs to specifically silence the mutant HTT allele in cell culture models or 
to engineer an allelic series into the HTT locus (Fink et al. 2016; Wang et al. 2013). 
TALEs still exhibit off-target effects and may raise immune issues (Guilinger 
et al. 2014). 

Gene editing became a widely accessible technology in 2012 with the charac- 
terization of the CRISPR system and its implications for targeted gene editing and 
regulation. The CRISPR system is composed of a Cas9 nuclease and a gRNA 
complex. To cut the DNA, Cas9 attaches to the guide RNA (gRNA), which targets 
a specific site in the organism’s DNA (Jinek et al. 2012; Wiedenheft et al. 2012). 
This system is found in archaea and bacteria, where it serves as a natural defense 
mechanism against bacteriophages. The system has been characterized and adapted 
for targeted genome editing in mammalian cells. Targeting has one sequence requirement, 
a protospacer adjacent motif (PAM; typically NGG) at the 3’ end of the DNA target site; this 
sequence is common in DNA and thus almost any gene can be targeted with the 
CRISPR system (Gilbert et al. 2013; Qi et al. 2013). As with previous gene editing 
proteins, the Cas9 can be modified to either silence or activate gene transcription 
(Fig. 2; Sander and Joung 2014; Larson et al. 2013). Due to some initial off-target 
cleavage events, the Cas9 nuclease was modified to become a Cas9 nickase (Cas9n; 
Ran et al. 2013). This modification drastically increased targeting specificity, as the 
binding of two Cas9n proteins targeting two different DNA sites was required to 
make a double strand break in the DNA and encouraged homologous recombination 
(HR) with a potential donor DNA strand. Overall the off-target effects of Cas9n 
could be reduced to background levels (O’Geen et al. 2015; Wu et al. 2014). The 
modified Cas9n was found to have similar cleavage efficiency when two gRNAs 
were used, one targeted on each strand of the DNA, resulting in a double strand 
break. The technique has been widely adopted to create disease-modeling cell lines 
and rodent and non-human primate models, and has been applied in non-viable human embryos 


Fig. 2 Illustrations of different CRISPR/Cas9 uses with variable effector domains. The wild-type 
Cas9 nuclease can be used to initiate double strand breaks, encouraging homologous recombination. 
The inactive Cas9 (dCas9) attached to a DNMT3 can be used for site-specific methylation, resulting 
in semi-permanent gene repression. A dCas9 can have a KRAB domain attached for temporary gene 
repression or a VP64 domain for activation 
(Liang et al. 2015; Chen et al. 2015). What has made the CRISPR system so 
accessible is that, unlike the zinc fingers and TALEs, the same core protein, 
Cas9, is used to target any sequence, whereas the targeting portion of the CRISPR 
system, the gRNA, is what varies. The gRNA can be designed and synthesized 
either in a standard lab or by an outside company. This separation of the targeting 
portion (gRNA) of the CRISPR system from the modifying portion (Cas9 or other 
effectors) allows for targeting multiple genes in one experiment (Wang et al. 2013). 
The ability to target multiple genes in a single experiment drastically reduces the 
time needed to model complex genetic disorders in which more than one gene is 
involved. All of these unique characteristics have resulted in a rapid popularization 
of the CRISPR system in research labs, with thousands of papers having been 
published in the last five years. 
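
The PAM requirement described above lends itself to a simple computational search. The Python sketch below, written under simplifying assumptions, lists every 20-nucleotide stretch that is followed by NGG on either strand of a sequence. The 20-nt protospacer length is the value commonly used with Streptococcus pyogenes Cas9 and is an assumption here, as are the function names; the sketch does no off-target scoring and is not a substitute for a dedicated gRNA design tool.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_cas9_sites(dna: str, protospacer_len: int = 20):
    """List (strand, start, protospacer, pam) for each site followed by NGG.

    For the '-' strand, `start` is a coordinate on the reverse complement.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


# Hypothetical toy sequence containing one site on the '+' strand.
toy = "A" * 20 + "TGG" + "C" * 5
for site in find_cas9_sites(toy):
    print(site)
```

Because the same Cas9 protein is used for every target, each hit returned by such a scan differs only in the gRNA that would have to be synthesized, which is what makes multiplexed targeting practical.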


Uses for Gene Editing to Understand Human Diseases 


Due to their ability to precisely target a gene or regulatory element, genome editing 
tools have been widely utilized to model human diseases both in cells and in animals. 
Neurodegenerative diseases such as Parkinson’s disease and HD have been modeled 
by introducing disease-causing mutations into human induced pluripotent stem cells 
(iPSCs) facilitated by genome editing tools (O’Brien et al. 2015; Soldner et al. 2011). 
CRISPR/Cas9 or TALENs can also be injected into zygotes or embryos to generate 
genetically modified animals. Researchers have injected TALEN-encoding 
mRNAs into zebrafish embryos to target the gene glucocerebrosidase 1, which is 
mutated in the lysosomal storage disorder Gaucher’s disease. The introduction of 
these TALENs caused a deletion mutation in glucocerebrosidase 1, and 
characteristics of Gaucher’s disease were present in this zebrafish model 
(Keatinge et al. 2015). Duchenne muscular dystrophy (DMD) is a neuromuscular 
disorder caused by loss-of-function mutations in the gene dmd. A DMD rat model 
was generated by delivering the CRISPR system into rat zygotes to target the dmd gene 
(Nakamura et al. 2014). These disease models are valuable tools for the exploration 
of disease mechanisms and for the pursuit of therapeutics. 

When combined with human pluripotent stem cells, genome editing tools can 
provide some unique advantages in disease modeling and mechanism study. Human 
pluripotent stem cells, including iPSCs and embryonic stem cells, can be directed to 
differentiate into any cell type of the human body under the correct differentiation conditions. 
Thus disease-relevant cell types, and changes during their development, can be studied in 
these models. When genome editing tools are used to add or remove a mutation at the 
pluripotent stem cell stage, isogenic cell lines with an almost identical genetic back- 
ground are obtained. As cells are differentiated into more restricted stem cells and 
terminally differentiated cells, the isogenic background will persist. Phenotypic 
changes of these cells are most likely a result of the mutation, as they have an identical 
genetic background. However, one may still have to consider epigenetic changes and 
mitochondrial mutations that may remain harbored in the patient’s iPSCs’ background 
(Chinnery et al. 2012; Calvanese et al. 2009). These isogenic cell lines can be subjected 
to systematic approaches including DNA microarray, RNA-seq and mass spectrometry 
for transcriptomic and proteomic information. Bioinformatic analysis can identify 
interesting gene/protein targets or signaling pathways that have distinct disease- 
associated patterns. The cleaner background of isogenic cell models should result in 
more relevant and reliable hits. After proper validation, these potentially important 
disease targets may lead to discovery of new mechanisms or drugs. 

Recent advances in stem cell research suggest that iPSCs may provide novel 
models of disease and new treatments for diseases. An isogenic iPSC line was 
established in the Ellerby lab by traditional HR in a human HD 
patient iPSC line. In this line, a corrected donor strand replaced the CAG 
expansion, converting the disease allele to a wild-type allele (An et al. 2012). The 
isogenic corrected line had the exact same genetic background as the patient, 
reducing the genetic variables that are present when one compares disease pheno- 
types across multiple different patients to matched wild type individuals. One of the 
first questions we addressed was whether we could take HD patient-derived iPSCs 
and, through genetic correction of the disease allele, reverse disease phenotypes. 
Interestingly, we did not detect phenotypes in the undifferentiated HD iPSCs but only 
observed disease phenotypes in the differentiated neural stem cell (NSC) state, and 
these phenotypes were reversible upon genetic correction of the patient mutation. 

To understand the molecular basis for the CAG repeat expansion-dependent 
disease phenotypes in iPSCs and NSCs, RNA-Seq was performed comparing the 
isogenic corrected lines to HD iPSCs and HD NSCs. We observed that there were few 
phenotypic differences between HD and wild type iPSCs, but there were substantial 
differences—over 2000 dysregulated genes—in the NSCs. Some of the key pathways 
that were dysregulated included TGF-β, netrin-1 signaling and development of the 
striatum (Fig. 3; Ring et al. 2015). Particularly important, our isogenic HD-iPSCs 
with corrected alleles identified the maturation or maintenance of medium spiny 
neurons (MSNs) as being dysregulated (Ring et al. 2015). We showed that the 
pathways or factors that were involved in this process were therapeutic targets for 
HD (Ring et al. 2015). A subsequent publication from another group emphasized that the 
de-differentiation of MSNs or loss of MSN identity in HD is a major source of 
dysfunction (Langfelder et al. 2016). These pathways offer new options for 
therapeutic treatments and drug targets. Using genetic engineering, we generated an 
isogenic allelic HD iPSC series for HD modeling (CAG repeat lengths of 21, 45, 72 and 100). 
By creating additional isogenic lines, the contribution of the CAG expansion to the 
disease phenotypes can be elucidated from background variation; this information 
can help guide researchers towards additional treatment targets (O’Brien et al. 2015). 
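
The comparison of HD and isogenic corrected lines described above rests on asking, gene by gene, how much expression differs between the two. The Python sketch below shows only that fold-change step, on hypothetical toy counts with placeholder gene names; the pseudocount, the two-fold threshold and the function names are assumptions made for illustration. It is not the analysis used in the cited study, and a real RNA-Seq comparison would use replicates and a dedicated statistical package.

```python
import math


def log2_fold_changes(disease, control, pseudocount=1.0):
    """Per-gene log2(disease / control), with a pseudocount to avoid log(0)."""
    return {
        gene: math.log2((disease[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in disease
        if gene in control
    }


def dysregulated(fold_changes, threshold=1.0):
    """Genes whose absolute log2 fold change meets the threshold (1.0 = two-fold)."""
    return sorted(g for g, lfc in fold_changes.items() if abs(lfc) >= threshold)


# Hypothetical normalized counts; gene names are placeholders, not real hits.
hd_nsc = {"gene_A": 400.0, "gene_B": 95.0, "gene_C": 10.0}
corrected_nsc = {"gene_A": 100.0, "gene_B": 100.0, "gene_C": 80.0}

changes = log2_fold_changes(hd_nsc, corrected_nsc)
print(dysregulated(changes))
```

Because the two lines are isogenic, a gene flagged this way is less likely to reflect background genetic variation between individuals, which is the advantage the text attributes to the corrected lines.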

Besides directly modifying the disease gene, genome editing tools can also be used 
to engineer cells to facilitate disease research by making reporter cell lines. In an effort 
to investigate the roles of a gene encoding a sodium channel subunit in epilepsy, a 
tdTomato fluorescence protein gene cassette was inserted into iPSCs under a 
GABAergic neuron-specific promoter with CRISPR/Cas9. When differentiated into 
GABAergic neurons, these cells were red fluorescently labeled and could be readily 
followed for electrophysiological studies (Liu et al. 2016). Another example is in the 
[Figure 3 flow chart. Steps: Generation of Isogenic Cell Lines by CRISPR (HD vs. Corrected); RNA-Seq Analysis; Signaling Pathway Analysis; Targeted Therapy] 


Fig. 3 A flow chart comparing the donor Huntington disease (HD) and genetically corrected 
isogenic iPSCs and NSCs. Transcriptomic analysis was performed on the cell lines in which 
significant differences were found in multiple signaling pathways. These newly identified path- 
ways could result in additional drug targets 


peripheral neuropathy Charcot-Marie-Tooth disease, type 1A. With TALENs, a 
bioluminescent reporter was integrated under the regulation of the disease-causing gene 
pmp22, which allowed high throughput screening for reagents that can decrease 
expression of this gene (Inglese et al. 2014). In an effort to better track the recombi- 
nation repair efficiency in HD cells, the Ellerby lab has designed a myc-tagged donor 
strand that, when incorporated into the cell, is detectable by both Western blot and PCR 
amplification; these methods are so sensitive that recombination efficiencies can be 
detected at levels as low as 5% (Fig. 4). For polyglutamine disease, it is also possible to 
detect the prevalence of the polyglutamine expansion through the use of specific 
antibodies, which detect the expanded polyglutamine region (Fig. 4; An et al. 2014). 
The ability to qualitatively assess how many cells have been corrected will increase the 
field’s understanding of what may be a therapeutic level of correction for the disease. 
Having specific tags to monitor genetic correction rates and the resulting phenotypic 
improvements will help the field design genetic correction strategies 
and optimize treatment conditions. 
Fig. 4 (a) Use of myc tag in corrected donor plasmid allows for both insertion screening at the 
DNA level by PCR (left) and at the protein level by Western blot (right); red triangles indicate 
expected band size. (b) Use of 1C2 antibody screening with an expanded donor plasmid, a rapid 
method to optimize different gRNA combinations for homologous recombination efficiency 


Gene Editing In Vivo to Treat Genetic Diseases 


With its extreme ease of use and targeting, the CRISPR system is being studied 
extensively with a goal of in vivo correction of genetic mutations. Recent advances 
have shown that it takes about 15 h for Cas9-mediated double strand breaks to be 
repaired; this is potentially due to Cas9 remaining bound to the DNA for an 
extended period of time and because it asymmetrically releases the target strand 
(Richardson et al. 2016). This asymmetric release of the strand has given 
researchers the ability to rationally design the donor strands in an effort to increase 
gene correction percentages; it also provides additional insight as to how to target 
and design the donor strands. The guide RNAs have also continued to evolve since 
the first characterization of the CRISPR system. Initially there were two 
components to the guide RNA, a crRNA and a tracrRNA; these could be fused into a single 
gRNA, creating a simpler method in which the gRNA could be delivered already 
assembled. Multiple assembled gRNAs could be placed on the same plasmid, allowing for 
multiple gene targeting with minimal plasmids (Wang et al. 2013; Hsu et al. 2013). 
A couple of new CRISPR variants have been characterized that offer even lower 
off-target binding levels and are smaller (Ran et al. 2015). Both of these new 
characteristics may be useful in eventual patient treatment, as a smaller CRISPR 
protein could be more easily packaged for delivery and lower off-target binding 
increases the specificity of the CRISPR protein, restricting the effects to the 
target site. 

The most exciting application of genome editing tools in human genetic diseases 
is genetic correction and normalization of disease mutations. This has been 
done in cells. For example, in myotonic dystrophy type 1, a genetic modification 
was introduced by TALENs in an NSC model, and this modification partially 
reversed disease phenotypes (Xia et al. 2015). More encouragingly, 
genetic correction has been achieved in adult animals. Recently several groups 
published genetic correction in a mouse DMD model. Adeno-associated virus- 
delivered CRISPR/Cas9 was used to remove a mutation from the gene dmd. Partial 
phenotypic recovery has been observed in these studies (Xu et al. 2016; Nelson 
et al. 2016; Tabebordbar et al. 2016). The use of CRISPR in vivo to ablate the 
rhodopsin gene carrying the dominant S334ter mutation in rats with severe autosomal 
dominant retinitis pigmentosa also highlights the use of genetic correction in 
disease (Bakondi et al. 2016). These proof-of-principle experiments may be the first 
steps towards overcoming many currently incurable genetic diseases. CRISPR 
technology is already being used in human cells and disease models with the 
eventual goal of patient treatment. A recent study conducted in China has even 
used CRISPR technology on non-viable human embryos (Liang et al. 2015). As this 
technology has advanced so rapidly, the scientific community has held a summit 
meeting to discuss the potential future of CRISPR technology, much in the same 
way the Asilomar Conference discussed recombinant DNA over 40 years ago 
(Baltimore et al. 2015; Berg et al. 1975a, b). 

In HD, it is possible that a variety of CRISPR tools could prove beneficial for 
treatment. Previous studies have shown that a reduction in mutant HTT levels can 
ameliorate symptoms of the disease (Canals et al. 2004; Conforti et al. 2008; 
Zuccato et al. 2008). A recent study has shown reduction of mutant Huntingtin in 
cells by using TALE-ATFs (artificial transcription factors) to specifically target the 
mutant allele by targeting SNPs common on that allele. The TALE-ATF has a 
KRAB domain attached that represses transcription of the mutant Huntingtin allele 
(Fink et al. 2016). This technique has yet to be tried in Huntington model mice; 
however, previous studies have used ATFs to repress transcription in the brains of 
mice (Bailus et al. 2016). Another approach using CRISPR would involve increas- 
ing transcription of genes that could be neuroprotective in HD; BDNF could be a 
potential target for this type of therapy (Pollock et al. 2016). As screening studies 
are further refined using more genetically engineered isogenic cell lines, it will be 
possible to uncover additional gene regulation targets. 

The ideal therapy for HD would involve gene replacement therapy, where the 
mutant allele would be replaced by a corrected donor allele. Using the CRISPR 
system, it will eventually be possible to do this correction in vivo. When designing 
the donor strand, it is possible to detect site-specific insertion by PCR if a small tag 
is added to the donor strand, allowing for optimization of different CRISPR 
components (Fig. 4). After design and condition optimization, there are still several 
issues that need to be addressed to develop CRISPR into an in vivo therapy. One 
area to examine is the immune response, as Cas9 is not an endogenous protein in 
mammals, although there are mouse models that constitutively express Cas9 from 
birth (Platt et al. 2014). Previous studies in humans with zinc finger proteins have 
shown minimal immune response, but Cas9 may 
elicit an immune response if given over an extended period of time. A second major 
concern for gene correction in vivo is the delivery of the CRISPR system to the 
desired organ or tissue. For certain diseases, it may be possible to directly inject the 
organ and correct only a subpopulation of the cells; for other diseases, especially 
those that affect the brain, delivery is more difficult (Liu et al. 2016; Yin et al. 
2016). Direct injection into the brain is possible, and packaging the CRISPR system 
into an appropriately pseudotyped viral vector could allow for additional coverage 
beyond the injection point. The CRISPR system has been packaged into both AAV 
and lentivirus and used successfully in several mouse studies (Yin et al. 2016; Senis 
et al. 2014; Wang et al. 2015; Graham 2016). Nanoparticles and purified proteins 
are additional methods that have been used to successfully deliver CRISPR into 
cells and tissues (Wang et al. 2016; Ramakrishna et al. 2014). Each of these 
delivery methods has advantages and disadvantages, but with additional optimiza- 
tion successful gene replacement therapy in vivo should be possible. Since early 
HD diagnosis is possible, genetic correction therapy could be performed during the 
asymptomatic stage, potentially preventing onset of the disease. 


Conclusion 


Genome engineering is providing neuroscientists with new methods to address 
critical questions in the field and offers the hope for new treatments of neurological 
genetic diseases. The application of genetic engineering to disease modeling is 
accelerating efforts to understand the molecular mechanism of these diseases and 
offers new approaches to identifying therapeutic targets and drugs. The recent 
advances in genetic engineering allow for better modeling and understanding of the 
role of SNPs in diseases with complex genetic alterations. These new genomic 
engineering technologies, which precisely alter the genome, are already offering 
insights into the complexity of the nervous system, its normal function and alter- 
ations in disease. Eventually these genome engineering technologies may correct 
the disease allele in human patients (in vivo) before symptoms manifest, resulting 
in therapy at the DNA level. 
Abstract Duchenne muscular dystrophy (DMD) is a devastating, degenerative muscle 
disease that affects ~1 in every 3500 male births. DMD arises from mutations in the 
DMD gene that prevent expression of its encoded protein, Dystrophin (Burghes et al. 
Nature 328:434-437, 1987). Interestingly, patients with Dmd mutations that delete 
certain segments of the Dystrophin coding region, but maintain protein reading frame, 
have a much milder form of the disease, known as Becker Muscular Dystrophy (BMD). 
This observation has spurred interest in developing “exon skipping” strategies in which 
certain mutation-containing or mutation-adjacent Dmd exons are intentionally removed 
in order to restore protein reading frame, and thereby Dystrophin expression, in DMD 
patients (Beroud et al. Hum Mutat 28:196-202, 2007; Yokota et al. Expert Opin Biol 
Ther 7:831-842, 2007). 


Recently our lab (Tabebordbar et al. 2015) and others (Long et al. 2015; Nelson et al. 
2015) reported a novel strategy to accomplish permanent sequence-specific modi- 
fication of the Dmd gene in vivo in the mdx mouse model of DMD. This strategy 
utilizes the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)- 
Cas9 RNA guided gene editing system, delivered using adeno-associated virus 
(AAV) vectors. We showed that, when administered systemically, AAV-Dmd- 
CRISPR enables sequence-targeted genome modification in each of the key affected 
cell types and organs of DMD model mdx mice, including cardiomyocytes, skeletal 
muscle fibers and endogenous muscle stem cells (Tabebordbar et al. 2015). Gene 
editing in these cells restores Dystrophin protein reading frame and expression, 
recovers muscle contractile function, enhances muscle resilience in the face of 
controlled muscle damage, and establishes a pool of therapeutically modified pro- 
genitors that can participate in subsequent muscle regenerative events. 

These studies provided a critical advance in allowing programmable genome editing 
that can irreversibly modify disease-causing mutations in the affected tissues of dystro- 
phic individuals. Moreover, the results represent critical proof-of-concept evidence 
demonstrating the feasibility of systemic gene editing in vivo, which has the potential 
to recover Dystrophin expression in up to 80% of patients with DMD (Beroud et al. 2007; 
Yokota et al. 2007). Yet, important challenges remain for the future therapeutic applica- 
tion of Dmd-CRISPR gene editing, including enhancing the efficiencies with which gene 
editing may be accomplished in muscle fibers and satellite cells and circumventing the 
possible emergence of a host immune response to the bacterial Cas9 endonuclease, which 
could interfere with gene editing and/or lead to elimination of gene-edited cells. Over- 
coming these challenges will be crucial for developing clinically relevant strategies to 
accomplish safe, efficient and durable in vivo gene editing for DMD. 


Duchenne Muscular Dystrophy 


Duchenne muscular dystrophy (DMD) is one of the most common X-linked genetic 
disorders in humans; it arises from point mutations, deletions or duplications in the 
DMD gene that prevent expression of its encoded protein, Dystrophin (Burghes et al. 
1987; Koenig et al. 1987). Dystrophin is an essential structural protein in skeletal and 
cardiac muscle (Ervasti and Campbell 1991). Its primary function is to link the 
cytoskeleton of muscle fibers to the extracellular matrix and thereby stabilize the 
muscle fiber membrane (Straub et al. 1992). Absence of functional Dystrophin protein 
increases the susceptibility of dystrophic muscle fibers to contraction-induced injury 
(Campbell and Kahl 1989). Increased cytosolic calcium following mechanical stress, 
activation of proteases (particularly calpains), destruction of membrane constituents 
and ultimately muscle fiber necrosis occur frequently in dystrophic muscles (reviewed 
in Tabebordbar et al. 2013). Due to continual myofiber destruction in dystrophic 
muscle, the resident pool of regenerative muscle stem cells (known as “satellite 
cells”) must support repeated rounds of activation and regeneration in an attempt to 
compensate for ongoing damage. As the disease advances, satellite cells show reduced 
capacity to regenerate muscle, possibly due to cell-intrinsic defects (Dumont et al. 
2015) or proliferation-induced reductions in telomere length (Sacco et al. 2010). Absent 
an adequate regenerative response, fat and fibrotic tissue replace muscle fibers, leading 
to further weakening and wasting (Wallace and McNally 2009). 


Current Gene-Targeted Therapeutic Strategies for DMD 


Current treatment options for DMD are disappointingly limited and focus mainly 
on managing symptoms and suppressing the immune and inflammatory response 
(Muir and Chamberlain 2009; Partridge 2011). Patients are typically diagnosed at 
3-5 years of age, they are wheelchair-bound in their second decade, and they have 
an average life expectancy of only about 30 years. In contrast, a related group of 
patients with mutations that impact this same gene but maintain its open reading 
frame produce an internally deleted but still partially functional Dystrophin protein 
that results in a markedly less severe disease known as Becker Muscular Dystrophy 
(BMD; England et al. 1990; Nakamura et al. 2008; Taglia et al. 2015). Many BMD 
patients are not diagnosed until adolescence or even adulthood and some enjoy a 
normal life span. These observations have provided motivation for the generation of 
rationally modified, truncated versions of Dystrophin for therapeutic application in 
DMD, including engineered “microdystrophins” and endogenous exon “skipped” 
DMD mRNAs. 

The extremely large size of the DMD gene (2.4 Mb) and its encoded mRNA (14 kb) 
makes it very difficult, if not impossible, to package full-length dystrophin expression 
cassettes into clinically relevant viral vectors such as adeno-associated viruses (AAVs), 
which have a packaging capacity of <5 kb. This limitation has propelled the generation 
of truncated mini- (6-8 kb) and microdystrophin (<4 kb) genes (Harper et al. 2002), 
which reduce the Dystrophin protein to its most essential functional elements. These 
rationally designed microgenes delete large regions of the internal Rod domain of 
Dystrophin, which contains 24 spectrin-like repeats and comprises 80% of the overall 
protein (Chamberlain 2002), while maintaining much of its functional integrity. 
Microdystrophin genes can be packaged into viral vectors for exogenous delivery and 
ectopic expression from ubiquitous or muscle-specific promoters (Fabb et al. 2002; 
Gregorevic et al. 2004), and delivery by AAVs results in effective expression of protein 
products that correctly localize to the sarcolemma and recruit other dystrophin glycopro- 
tein complex (DGC) proteins. Importantly, while microdystrophins are not equivalent in 
function to full-length Dystrophin, they have been shown to ameliorate DMD pathologies 
in mdx mice (Harper et al. 2002; Wang et al. 2000) and dystrophin-deficient canine 
models (Shin et al. 2013). A related approach—exon skipping—similarly generates a 
modified Dystrophin protein product, but in this case the endogenous Dmd pre-mRNA 
transcript is targeted to remove mutation-carrying and/or mutation-adjacent exons from 
the mRNA. By choosing specific exons for removal, exon skipping approaches are able 
to generate Dmd mRNAs with restored reading frame. 
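
The reading-frame logic behind exon skipping can be illustrated with a short Python sketch. It is only a toy model under stated assumptions: the exon lengths are made-up values chosen for illustration, not real Dmd exon sizes, and the function names are hypothetical. The sketch checks whether the total number of nucleotides removed from the mRNA is a multiple of three, which is the condition for the downstream codons to stay in frame.

```python
def preserves_frame(removed_exon_lengths):
    """Return True if removing these exons keeps the downstream reading frame."""
    return sum(removed_exon_lengths) % 3 == 0


def skipping_restores_frame(deleted_exon_lengths, skipped_exon_lengths):
    """Return True if a frame-disrupting deletion becomes in-frame once the
    additional (skipped) exons are also left out of the mRNA."""
    disrupted = not preserves_frame(deleted_exon_lengths)
    restored = preserves_frame(
        list(deleted_exon_lengths) + list(skipped_exon_lengths)
    )
    return disrupted and restored


# Hypothetical exon lengths in nucleotides (illustrative values only).
deleted = [100, 121]    # a deletion removing 221 nt shifts the frame
candidate_skip = [94]   # skipping one adjacent exon removes 94 nt more

print(preserves_frame(deleted))                          # deletion alone
print(skipping_restores_frame(deleted, candidate_skip))  # deletion plus skip
```

In this example the deletion alone removes 221 nucleotides, which is not divisible by three, whereas the deletion plus the skipped exon removes 315 nucleotides, which is. The same check applies to a nonsense mutation inside a single exon: leaving out that exon removes the premature stop codon, and the frame is kept only if the exon length is itself a multiple of three.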

In both gene complementation by microdystrophin and exon skipping approaches, 
the overall goal is to convert a severe DMD mutation, lacking Dystrophin protein 
expression entirely, into a milder BMD-like one, via expression of a truncated but still 
partially functional protein. It has been estimated that exon skipping strategies for Dmd 
could ultimately provide significant therapeutic benefit to the majority (~80%) of 
existing DMD patients (Beroud et al. 2007; Yokota et al. 2007), while complementa- 
tion by ectopic expression of microdystrophin could in theory be useful for any 
mutation that abrogates Dystrophin protein production. 


Challenges for Therapeutic Exon Skipping 
and Microdystrophin Delivery Strategies 


Clinical application of exon skipping approaches to date has relied on antisense 
oligonucleotides (AONs) designed to mask splice donor and acceptor sequences in 
mutation-affected or mutation-adjacent exons. However, for many of the therapeutically 
relevant Dmd exons, exon-skipping AONs have not yet been developed or have 
not progressed to clinical trials (https://www.sarepta.com/our-pipeline). In addition, in 
a recent clinical trial, AON-mediated skipping of Dmd exon 51 failed to achieve 
sufficient rescue of Dystrophin protein to meet predetermined clinical endpoints 
(Lu et al. 2014). Importantly, even with relatively stable chemistries (Goyenvalle 
et al. 2015), AONs have a defined half-life (Goyenvalle et al. 2015; Vila et al. 2015) 
and require repeated (weekly) administration. This need for recurrent treatment 
increases the cost and potential side effects of AON therapy. Also, delivery of AONs 
to cardiac muscle has been more challenging than delivery to skeletal muscle, and 
delivery to resident muscle stem cells, if it occurs, is unlikely to be effective due to the 
dilution of AONs that occurs during cell proliferation. Thus, any benefit from AON 
therapy in satellite cells would be lost during muscle regenerative responses, which 
require proliferation of satellite cells and their progeny. Strategies in which AONs are 
delivered virally, by embedding within small nuclear RNAs, appear to suffer from 
similar progressive loss of the viral genome and its encoded AONs from dystrophic 
muscles (Vulin et al. 2012; Le Hir et al. 2013). 

Relatedly, exogenous gene supplementation therapies using partially functional 
engineered microdystrophin constructs have encountered some challenges in clinical 
application. An initial Phase I clinical trial of microdystrophin gene therapies in human 
DMD patients yielded suboptimal transgene expression despite continued presence of 
vector genomes, possibly due to pre-existing or acquired T cell-mediated immune 
responses to dystrophin epitopes or AAV capsid proteins (see below), disease-associated 
inflammatory responses, CMV promoter silencing, or low AAV tropism (Bowles et al. 
2012; Mendell et al. 2010). Additional Phase I trials of microdystrophin therapies 
utilizing different gene regulatory elements and AAV serotypes are currently underway 
(ClinicalTrials.gov Identifier: NCT02376816), and may mitigate these concerns; how- 
ever, similar to AON delivery, delivery of AAV-microdystrophin to muscle satellite 
cells is unlikely to result in sustained transgene production, as the episomal AAV 
genome will be diluted with successive cell divisions. The challenges encountered 
in developing effective AON and microdystrophin therapies highlight the need to 
evaluate alternative strategies that could provide an efficient, permanent, one-time, 
systemic treatment to restore expression of Dystrophin in skeletal and cardiac 
muscles, as well as muscle satellite cells, of DMD patients. 
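
The dilution argument made above for AONs and episomal AAV genomes can be made concrete with a deliberately simple Python sketch. It assumes, purely for illustration, that a non-replicating episome is split evenly between daughter cells at each division and is not otherwise degraded; the starting copy numbers are hypothetical and the function names are not taken from any published model.

```python
def episome_copies_per_cell(initial_copies, divisions):
    """Average episome copies per cell after a number of symmetric divisions,
    assuming the episome does not replicate and is partitioned evenly."""
    return initial_copies / (2 ** divisions)


def divisions_until_below(initial_copies, threshold=1.0):
    """Number of divisions needed before the average copy number per cell
    falls below the given threshold."""
    copies = float(initial_copies)
    divisions = 0
    while copies >= threshold:
        copies /= 2
        divisions += 1
    return divisions


# Hypothetical starting point: 8 vector genomes in one transduced satellite cell.
for n in range(5):
    print(n, episome_copies_per_cell(8, n))
```

Under these assumptions the average copy number halves with every division, so even a well-transduced satellite cell passes on less than one vector genome per descendant after a handful of divisions. A chromosomal edit, by contrast, is copied with the genome and is not diluted.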


Gene-Editing Approaches to Restore Dystrophin Function 
in DMD 


In a recent report (Tabebordbar et al. 2015), we described a novel genome-targeted 
editing approach (Fig. 1), based on Dmd exon skipping approaches, that was designed 
to accomplish irreversible removal of a mutated segment of the Dmd gene in the 
affected tissues of mdx mice, an animal model of human DMD (Sicinski et al. 1989). 
We further showed that this approach resulted in production of functional Dystrophin 
protein and improved muscle stability and contractility (Tabebordbar et al. 2015). 

Fig. 1 Gene-editing strategy for recovery of Dystrophin expression in DMD model mice. Mdx 
mice with a mutation in the Dmd gene were injected with AAV particles carrying clustered 
regularly interspaced short palindromic repeats (CRISPR)-Cas9 endonucleases and paired guide 
RNAs targeting the mutated Dmd exon 23. This procedure led to excision of the targeted DNA and 
restored Dmd gene reading frame and Dystrophin expression in gene-edited skeletal muscle fibers, 
cardiomyocytes and muscle stem cells following local delivery or delivery via the bloodstream, in 
dystrophic mice. Gene-edited nuclei are shown in green and non-edited nuclei are shown in blue. 
The mutated Dmd mRNA is degraded and Dystrophin expression is lacking in the dystrophic 
tissues of untreated mice (graphical summary of data reported in Tabebordbar et al. 2015; 
see text for details) 

Our approach made use of the CRISPR-Cas9 gene editing system, which allows the 
introduction of user-defined “cuts” in the genome. Each CRISPR-Cas9 gene editing 
complex consists of a Cas9 endonuclease and a programmable guide RNA (gRNA); 
the complex probes the genome for protospacer-adjacent motifs (PAMs) [e.g., -NGG (Ran et al. 
2013a) or -NNGRR(T) (Ran et al. 2015)]. Upon PAM recognition and base-pairing of 
the gRNA with an adjacent complementary DNA sequence, Cas9 creates a double- 
strand break (DSB) in the genomic DNA. Introduction of DSBs at two sites in the same 
linear stretch of DNA favors excision of the intervening sequence, and repair of this 
lesion by non-homologous end joining (NHEJ) juxtaposes the remaining 5’ and 3’ 
sequences (Canver et al. 2014; Tabebordbar et al. 2015). Alternatively, inclusion of a 
homologous donor template enables repair by homology-directed recombination 
(HDR), leading to incorporation of precise nucleotide changes, encoded in the donor 
template, at the site of the DSB. Changes introduced by HDR can range from a single 
base pair to insertions of entire genes or even large cassettes of multiple genes (Urnov 
et al. 2005; Ding et al. 2013; Voit et al. 2014). Significantly, the relative activity of 
NHEJ and HDR repair mechanisms can vary with cell type, cell cycle and develop- 
mental stage, which can have important ramifications for the efficacy and outcome of 
therapeutic genome modification (Yang et al. 2016). 
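
The targeting and excision logic described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the design pipeline used in the cited studies: it scans only the forward strand of a made-up sequence, the function names are hypothetical, and the cut position (three base pairs upstream of the PAM) is the commonly described position for Cas9 and is not stated in the text above. PAMs are written with IUPAC codes, where N is any base and R is A or G.

```python
import re

# Minimal IUPAC table covering the codes used in the PAMs mentioned above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}


def pam_to_regex(pam):
    """Translate a PAM written in IUPAC codes into a regular expression."""
    return "".join(IUPAC[base] for base in pam)


def find_targets(seq, pam, guide_len=20):
    """List (protospacer, pam, cut_index) for forward-strand sites in seq.

    The protospacer is the guide_len bases immediately 5' of the PAM; the cut
    index is assumed to lie 3 bases upstream of the PAM."""
    pattern = re.compile("(?=(" + pam_to_regex(pam) + "))")  # overlapping hits
    hits = []
    for match in pattern.finditer(seq):
        pam_start = match.start()
        if pam_start >= guide_len:
            protospacer = seq[pam_start - guide_len:pam_start]
            hits.append((protospacer, match.group(1), pam_start - 3))
    return hits


def excise(seq, cut_a, cut_b):
    """Join the sequence 5' of the first cut to the sequence 3' of the second,
    modelling clean end joining after two double-strand breaks."""
    left, right = sorted((cut_a, cut_b))
    return seq[:left] + seq[right:]


# Made-up sequence with two NGG sites flanking a stretch to be removed.
toy = "A" * 20 + "TGG" + "C" * 10 + "T" * 20 + "AGG" + "A" * 5
sites = find_targets(toy, "NGG")
cuts = [cut for _, _, cut in sites]
print(cuts)
print(excise(toy, cuts[0], cuts[1]))
```

The sketch models a perfect junction. In practice NHEJ often adds or removes a few bases at the joined ends, so the real outcome at each junction is a distribution of sequences rather than a single product.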

CRISPR-Cas9 RNA-guided endonucleases (RGENs) have been used to target both 
expressed and non-expressed genes in multiple cell types from multiple organisms both 
in vitro (Cho et al. 2013; Cong et al. 2013; DiCarlo et al. 2013; Ding et al. 2013; 
Friedland et al. 2013; Hwang et al. 2013; Mali et al. 2013; Wu et al. 2016) and in vivo 
(Ding et al. 2014; Xue et al. 2014; Yin et al. 2014; Ran et al. 2015; Yang et al. 2016). 
Published data demonstrate the utility of this system for multi-organ gene targeting of 
many distinct cell lineages, including hepatocytes, muscle fibers, cardiomyocytes, and 
muscle regenerative stem cells (Long et al. 2015; Nelson et al. 2015; Ran et al. 2015; 
Tabebordbar et al. 2015; Yang et al. 2016). We adapted the CRISPR-Cas9 system for 
Dmd editing in cardiac and skeletal muscle in vivo by utilizing a smaller Cas ortholog 
from Staphylococcus aureus (SaCas9), which could be packaged into recombinant 
AAV particles using the muscle-tropic serotype 9 (Zincarelli et al. 2008). Our strategy 
(Fig. 1) employed a dual AAV system (termed “AAV-Dmd-CRISPR”), which, due to 
AAV packaging limitations, was superior to single vector systems in terms of gene 
editing efficiency (Tabebordbar et al. 2015). In the dual system (Tabebordbar et al. 
2015), the first AAV delivers SaCas9, driven by a strong CMV promoter, whereas the 
second AAV carries two gRNAs that target sequences in the introns flanking mouse 
Dmd exon 23 (“Dmd23 gRNAs”), each driven by a U6 promoter. This targeting of 
intronic sequences is important because it allows for tolerance of small insertions and 
deletions that are common with NHEJ-mediated repair of DNA DSBs (Symington and 
Gautier 2011). When injected intramuscularly or systemically into adult (P42) or early 
postnatal (P3) recipient mice, which carry a nonsense mutation (mdx) in Dmd exon 
23, AAV-Dmd-CRISPR caused excision of exon 23 in heart cells (cardiomyocytes), 
skeletal muscle fibers and muscle stem cells [satellite cells, marked by transgenic 
expression of the fluorescent zsGreen protein from the Pax7 promoter (Bosnakovski 
et al. 2008)], producing an exon 23-deleted Dystrophin mRNA that, when translated, 
generated a truncated but functional Dystrophin protein (Tabebordbar et al. 2015; 
Fig. 1). Dystrophin protein restoration in AAV-Dmd-CRISPR treated mdx mice 
improved structural and functional aspects of the muscle, increased muscle strength 
and improved resistance to eccentric contraction-induced damage. Importantly, 
AAV-Dmd-CRISPR gene editing complexes could be disseminated systemically and 
were functional in both neonatal and adult mice. Exon-deleted transcripts represented 
almost 50% of total Dmd mRNA in muscle after intramuscular delivery in adults and 
5-15% in skeletal and cardiac muscles after systemic delivery in neonates 
(Tabebordbar et al. 2015). 

Importantly, and emphasizing the robustness and reproducibility of these results, 
similar outcomes were reported simultaneously by two other groups (Long et al. 2015; 
Nelson et al. 2015) using different Cas9 proteins and regulatory elements (Long et al. 
2015), different AAV serotypes (Nelson et al. 2015), different routes of systemic 
administration (Long et al. 2015), and different gRNAs (Long et al. 2015; Nelson et al. 
2015). All three groups reported gene editing in skeletal muscle fibers and 
cardiomyocytes, with efficiencies in skeletal muscle reported by Long et al. and Nelson 
et al. to vary from 1 to 67% Dystrophin-positive fibers, depending on the delivery approach 
used (local vs. systemic), dose of virus and age of the recipient animals. Long et al. also 
documented Dmd modification in vascular smooth muscle cells but not in brain, and 
our group, as discussed above, demonstrated detectable editing in endogenous muscle 
satellite cells (Tabebordbar et al. 2015). Finally, by analyzing treated muscle tissues at 
4, 8, and 12 weeks after AAV injection, Long et al. found that the percentage of 
dystrophin-positive myofibers may increase over time, and Nelson et al. observed that 
dystrophin restoration could be maintained for at least 6 months after treatment, 
indicating the potential long-term efficacy of AAV-Dmd-CRISPR therapies. Promis- 
ingly, differences in experimental design among the three studies and the varying 
efficiencies obtained suggest that multiple parameters may be adjusted and optimized 
to enhance genomic editing and increase dystrophin protein expression levels for more 
effective treatment of disease phenotypes by Dmd-CRISPR. 

In summary, published work from our lab and others provides strong evidence 
supporting the efficacy of in vivo genome editing to correct disruptive mutations in 
DMD in a relevant dystrophic mouse model (Long et al. 2015; Nelson et al. 2015; 
Tabebordbar et al. 2015). These data indicate that programmable CRISPR com- 
plexes can be delivered locally and systemically to terminally differentiated skel- 
etal muscle fibers, cardiomyocytes and smooth muscle cells, as well as regenerative 
muscle satellite cells, in neonatal and adult mice, where they mediate targeted gene 
modification, restore Dystrophin expression and partially recover functional defi- 
ciencies of dystrophic muscle. As prior studies in mice and humans indicate that 
Dystrophin levels as low as 3-15% of wild type are sufficient to ameliorate 
pathologic symptoms in the heart and skeletal muscle (van Putten et al. 2012, 
2013, 2014; Long et al. 2014), and levels as low as 30% can completely suppress 
the dystrophic phenotype (Neri et al. 2007), the level of Dystrophin expression that 
is potentially achievable by one-time administration of AAV-Dmd-CRISPR argues for 
further development of this system, which could be used independently or together 
with other therapies, including AON-mediated exon skipping (Aartsma-Rus et al. 
2009) and AAV-mediated delivery of engineered “microdystrophins” (Harper et al. 
2002; Ramos and Chamberlain 2015), as discussed above. 


Remaining Challenges for Therapeutic Development 
of DMD-CRISPR 


Taken together, the rodent studies described above provide strong pre-clinical proof-of- 
concept data that should inspire further evaluation and optimization of AAV-CRISPR 
as a new therapeutic option for DMD patients, either as a stand-alone intervention or in 
conjunction with other existing DMD therapies. Below, we discuss a number of 
challenges that remain to be overcome before realizing the potential of this approach in 
human patients. 


Challenges of DMD-CRISPR Delivery 


Engineered recombinant AAVs are particularly attractive vectors for both local and 
systemic delivery of gene editing complexes due to their general non-pathogenicity in 
human populations, their relatively low immunogenicity, and their inability to integrate 
efficiently into the genome (Gao et al. 2004; Boutin et al. 2010). Because of these traits, 
AAVs are currently in use in several human clinical trials (Mingozzi and High 2011; 
Kotterman and Schaffer 2014), and the immune response to AAV vectors has been 
extensively studied in both animal models and humans. Because engineered AAV 
vectors do not replicate and do not encode viral proteins, immune responses to AAVs 
are directed solely at the viral capsid and exhibit a relatively low pro-inflammatory 
profile (Mingozzi and High 2011). While pre-existing and acquired immunity to AAV 
remains a challenge for systemic, and repeated, administration of AAV vectors in 
human populations, these issues have been investigated for several decades and 
promising pharmacologic and physical strategies have emerged (Mingozzi and High 
2011). In addition, clinical responses to AAV administration have been monitored in 
hundreds of human subjects, with little evidence as yet of acute adverse events 
(Mingozzi and High 2011). Thus, the successful application of AAV-mediated therapy 
in multiple human trials suggests that the immune response to AAV itself is unlikely to 
preclude gene editing therapies based on AAV delivery. 

Still, a clear limitation of current AAV systems is that levels of gene targeting 
achieved in mouse models by AAV-mediated delivery of CRISPR-Cas9 to muscle 
satellite cells are rather low (<5% of satellite cells targeted; Tabebordbar et al. 2015), 
suggesting a need to investigate additional AAV serotypes to identify those with 
optimal tropism for satellite cells. Directed evolution and in vivo selection have been 
used recently to engineer novel AAV capsids with high tropism for tissues that are 
difficult to transduce with naturally occurring AAVs, such as human hepatocytes in a 
xenograft liver model (Lisowski et al. 2014) and the outer retina after injection into the 
eye’s vitreous humor (Dalkara et al. 2013). In addition, transduction rates of blood- 
forming hematopoietic stem cells have been improved through incorporation of novel 
amino acid substitutions in capsids (Song et al. 2013a, b). Thus, the application of 
directed evolution and in vivo selection strategies for generating novel AAV serotypes 
with high tropism for satellite cells represents an exciting future direction for increasing 
gene-editing efficiencies in these cells in vivo. 

On the other hand, the development of alternative delivery strategies that enable 
transient expression of DMD-CRISPR may hold some advantages, particularly since 
the therapeutic effect of gene-editing approaches does not depend on persistent expres- 
sion of Cas9 and gRNAs. Transient expression of CRISPR components could mitigate 
several of the possible adverse effects associated with prolonged Cas9 exposure, 
including potential genomic toxicity and immunogenicity (Wang et al. 2015). Indeed, 
in vitro experiments indicate that transient expression of Cas9 produces fewer 
off-target effects (Kim et al. 2014; Zuris et al. 2015). 

Recent advances in lipid nanoparticle-mediated delivery of Cas9:gRNA com- 
plexes in vitro (Kim et al. 2014; Woo et al. 2015; Zuris et al. 2015) and Cas9 mRNA 
in vivo (Yin et al. 2016) provide additional promising avenues that may circumvent 
the challenges of AAV immunity. Delivering Cas9 and gRNAs conjugated with cell 
penetrating peptides (CPPs) has also been useful in targeting gene-editing com- 
plexes to human cell lines in culture (Ramakrishna et al. 2014), and combining this 
approach with incorporation of novel muscle-homing peptides (Gao et al. 2014) 
could potentially be effective for in vivo delivery of DMD-CRISPR. 


Potential Immune Response to Restored Dystrophin Protein 


A possible immune response to the restored Dystrophin protein is also of potential 
concern for clinical application of DMD-CRISPR-mediated gene editing. However, 
because the types of DMD mutations seen in patients vary widely (Aartsma-Rus 
et al. 2009), individual immune responses to Dystrophin protein are likely to vary 
as well. These responses will depend at least in part on the nature of the mutation 
and on the frequency of “natural” exon skipping, which gives rise to revertant 
fibers in both DMD patients and mdx mice (Hoffman et al. 1990; Burrow et al. 1991; 
Klein et al. 1992; Nicholson et al. 1993; Fanin et al. 1995; Uchino et al. 1995; Lu 
et al. 2000) and may allow for endogenous exposure and tolerance to near-full length 
Dystrophin. Interestingly, in gene therapy trials for hemophilia B, in which AAV 
vectors were used to deliver Factor IX (F.IX), no subjects developed immune 
responses against the F.IX transgene, even though some carried null mutations in 
the F.IX gene (Manno et al. 2006; Nathwani et al. 2011). Similarly, promising results 
from studies using “microdystrophin” in mice and primates suggest that this protein 
is effectively expressed for up to 5 months without overt T cell or cytokine responses 
(Rodino-Klapac et al. 2010). These data argue that acquired immunity against the 
therapeutic protein also may not be therapy limiting. On the other hand, results from 
a clinical trial using intramuscular AAV-mediated delivery of microdystrophin, 
expressed under the control of a ubiquitous CMV promoter, revealed the presence 
in some patients of T cells recognizing self and non-self Dystrophin epitopes 
(Mendell et al. 2010). Interestingly, these T cells were present both before and 
after vector injection in two of the six patients, raising the possibility that screening 
for pre-existing immunity to Dystrophin protein in larger cohorts of DMD patients 
could provide useful information relevant to patient inclusion and exclusion criteria 
in future trials. Anti-Dystrophin antibodies were not detected in any of the treated 
patients; however, detection of Dystrophin-specific T cells and a lack of transgene 
expression in muscles of patients injected with AAV-microdystrophin (with the 
exception of two patients analyzed 6 weeks after injection) may suggest a cytotoxic 
response against fibers expressing microdystrophin. Thus, currently available data 
point to a compelling need for further studies to investigate more deeply the potential 
immune response to restored Dystrophin expression in dystrophic muscle. 


Pre-existing and Acquired Immunity to Cas9 


Potential immunity to the Cas9 endonuclease is also a significant consideration for 
therapeutics development in humans. An essential component of the CRISPR- 
based gene-editing machinery, Cas9 is a bacterially derived protein whose expres- 
sion in transduced cells can evoke both humoral and cellular responses (Wang et al. 
2015; and see below). Additionally, about 20% of individuals in the human 
population are persistent carriers of Staphylococcus aureus and another 60% have 
been periodic carriers at some point in their lives (Kluytmans et al. 1997). Thus, a 
significant fraction of potential patients is likely to have been exposed to the Cas9 
protein from this species, raising the possibility that a pre-existing anti-Cas9 
immune response could modulate the efficacy of CRISPR-mediated gene editing 
for recovery of Dystrophin expression in dystrophic muscles. Moreover, as emerg- 
ing data suggest that the immune system and its products can modulate the 
expression of AAV-encoded transgenes (Mingozzi and High 2011), as well as 
components of cellular DNA damage response pathways (Jackson and Bartek 
2009; Calvo et al. 2012), Cas9-induced immune responses could potentially alter 
both the degree of on-target DMD editing and the frequency and types of off-target 
modifications induced. Thus, while studies in our lab and others (Long et al. 2015; 
Nelson et al. 2015; Tabebordbar et al. 2015) clearly demonstrate that anti-Cas9 
immunity does not preclude gene editing in vivo, Cas9 immune responses could, 
nevertheless, have profound implications for the persistence of therapeutic benefit 
in the muscle and other tissues. We therefore believe that it is particularly important 
at this juncture to begin to assess the nature and consequences of the immune 
response to the foreign Cas9 protein itself and to determine whether preventing or 
ameliorating this response might improve the efficiency, durability, repeatability 
and/or safety of Cas9-mediated therapeutic gene editing. 


Assessing Mutagenic Events at On-Target and Off-Target 
Sites 


Off-target modifications pose a potential threat for gene editing approaches because 
the unintended activity of CRISPR-Cas9 at these locations can cause pathogenic 
modifications that impair cellular function or promote tumorigenesis. Furthermore, 
because in general Cas9-induced DNA DSBs can be repaired by either HDR or 
NHEJ, editing can result in different outcomes, depending on the number of alleles 
affected and the type of modification introduced. Thus, it is critical to develop tools 
that enable facile assessment of mutagenic potential in an un-biased genome-wide 
manner, since such evaluations are likely to show patient- and gRNA-specific 
variation. 

Recent work has produced several strategies to reduce genome-wide 
off-target mutations by Streptococcus pyogenes Cas9 (SpCas9). These strategies 
include use of paired SpCas9 nickases (Ran et al. 2013b), gRNAs with reduced length 
of the guide sequence (Fu et al. 2014) and the engineering of SpCas9 variants with 
amino acid substitutions in the DNA binding domain that reduce off-target rates 
(Kleinstiver et al. 2016; Slaymaker et al. 2016). Yet, there is still a need to improve 
the specificity of SpCas9 and its smaller orthologs (e.g., SaCas9), and this issue is 
particularly important for the targeting of muscle stem cells, which have substantial 
proliferative capacity. The risk of generating undesired and deleterious mutations at 
proto-oncogene loci or at loci critical to stem cell function by CRISPR-Cas9 transduc- 
tion of these cells must be rigorously analyzed before proceeding further with clinical 
translation of gene editing for DMD. 
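
As a rough illustration of how candidate off-target sites might be flagged computationally, the Python sketch below counts mismatches between a guide sequence and every same-length window of a reference sequence. It is a naive forward-strand scan over made-up sequences with hypothetical function names. It ignores the PAM requirement, the reverse strand, bulges and chromatin context, and it is no substitute for the unbiased genome-wide assessment called for above.

```python
def mismatches(guide, site):
    """Count positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide, site))


def candidate_off_targets(guide, reference, max_mismatches=3):
    """Return (index, window, mismatch_count) for windows of the reference
    that resemble the guide but are not a perfect match."""
    n = len(guide)
    hits = []
    for i in range(len(reference) - n + 1):
        window = reference[i:i + n]
        d = mismatches(guide, window)
        if 0 < d <= max_mismatches:
            hits.append((i, window, d))
    return hits


# Made-up guide and reference: one perfect site and one single-mismatch site.
guide = "ACGTACGTAC"
reference = "TTTT" + "ACGTACGTAC" + "GGGG" + "ACGAACGTAC" + "TTTT"
print(candidate_off_targets(guide, reference, max_mismatches=1))
```

The perfect match at the intended site is excluded by design, so the list contains only near matches that would merit closer inspection. Raising `max_mismatches` widens the net at the cost of more candidates to check.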


Enabling HDR for Precise Repair of Dmd 


Prior work in mice demonstrates that DMD pathology in skeletal muscle can be 
reversed by transplantation of sorted muscle stem cells isolated from wild-type 
animals (carrying a normal copy of the Dmd gene; Cerletti et al. 2008; Sacco et al. 
2010). However, muscle stem cells are extremely rare, cannot be expanded effec- 
tively ex vivo, must be delivered by intramuscular injection (as they fail to migrate 
to muscle tissue when injected intravenously), and do not engraft cardiac muscle, 
which also is affected by Dmd mutation. These significant complications have 
limited the application of stem cell transplantation therapy to DMD, despite 
promising results in individually injected muscle groups. 

Likewise, as discussed above, recent reports document the feasibility of 
AAV-based delivery of gene-editing complexes into cardiac and skeletal muscle 
in vivo and demonstrate that this system could be used to specifically excise a mutated 
segment of the Dmd gene in mdx mice to restore Dmd reading frame and allow 
production of a partially functional Dystrophin protein that improves muscle stability 
and contractility (Long et al. 2015; Nelson et al. 2015; Tabebordbar et al. 2015). 
However, it is important to note that the “first-generation” gene-editing strategies 
applied in these studies do not produce a full length Dystrophin. Instead, these 
approaches generate an internally truncated protein analogous to that seen in patients 
with BMD. While BMD is a markedly less severe disease compared to DMD, BMD 
patients still experience muscle pathology, and so, while clearly providing a potential 
clinical benefit, this approach is not fully curative for DMD. For this reason, future 
studies should be aimed at achieving full restitution of Dystrophin protein expression 
through precise gene editing to restore the normal Dmd gene sequence. Importantly, as 
conventional wisdom holds that HDR is limited to proliferative cells and DSBs 
introduced into post-mitotic cells (e.g., muscle fibers) will be repaired instead by 
NHEJ, it is likely that achieving precise repair of the Dmd gene will require efficient 
co-delivery into muscle satellite cells of CRISPR-Cas9, Dmd-targeting guide RNAs, 
and donor DNA template to direct HDR. Such a feat will, in turn, likely necessitate the 
identification of novel or optimized delivery vehicles that exhibit high satellite cell 
tropism (see above). While certainly challenging, success in such an approach would 
represent a very promising treatment strategy for DMD. 


Gene-Editing Therapy in Combination with AONs 
or Microdystrophin 


AAV-mediated delivery of expression constructs encoding AONs has been 
shown to enable widespread exon skipping, restoration of Dystrophin protein 
production and improvement of muscle function in short-term animal studies 
(Goyenvalle et al. 2004, 2012; Denti et al. 2006; Le Guiner et al. 2014); however, 
long-term studies in a more severe mouse model (Le Hir et al. 2013) and also in the 
Golden retriever model of DMD (Vulin et al. 2012) revealed that vector genomes 
are lost from dystrophic muscle upon muscle damage and also over time. This 
observation can be explained by injury-induced loss or degeneration of muscle 
fibers that previously were transduced by the AAV vector and subsequent incorporation 
of new satellite cell-derived nuclei into the muscle. Importantly, as noted 
above, the low rate of satellite cell transduction with the AAV serotypes tested 
thus far, together with the likelihood that these cells and their progeny proliferate 
prior to incorporation into muscle fibers, makes it doubtful that additional vector 
genomes are delivered to muscle via fusion of satellite cell progeny in this system. 
Consistent with this, acute muscle damage by cardiotoxin injury of AAV-injected 
mouse dystrophic muscle results in rapid loss of vector genome from the muscle 
(Le Hir et al. 2013). Irreversible gene correction of regenerating satellite cells and 
their progeny, achieved by gene editing, has the potential to overcome this chal- 
lenge. Moreover, to avoid immune response complications related to 
re-administration of AAV, non-viral delivery of DMD-CRISPR to dystrophic 
muscle could potentially be used to complement viral delivery of AONs or 
microdystrophin to achieve long-term and persistent Dystrophin restoration. 


Possible Application of CRISPR-Mediated Gene-Editing 
Strategies in Other Diseases 


The reprogrammable targeting of the Cas9 endonuclease via easily constructed 
gRNAs presents the exciting possibility of utilizing this system to treat a wide 
range of genetic diseases. Results from Dmd targeting by AAV-CRISPR in mdx 
mice are most immediately pertinent to other muscle disorders that are likewise 
amenable to mRNA splicing modulation, i.e., exon skipping or exon retention 
strategies, conventionally achieved by AONs. These disorders include primary 
dysferlinopathies, such as limb-girdle muscular dystrophy type 2B, resulting from 
mutations in the large dysferlin protein coding region that may be skipped 
inconsequentially if contained in redundant C2 domains (Wein et al. 2010). 
AONs also have been used in spinal muscular atrophy (SMA) to interrupt the 
function of an intronic splicing silencer that would otherwise result in the 
omission of exon 7 of the survival motor neuron 2 (SMN2) protein product, 
thereby allowing compensation for the loss-of-function of its paralog, survival 
motor neuron 1 (SMN1) in SMA patients (Burghes and McGovern 2010). Fur- 
thermore, the use of AAV-CRISPR as an AON alternative is suitable for 
non-muscle specific diseases like Leber congenital amaurosis (Maeder and 
Gersbach 2016). 

Aside from complementing and potentially superseding the use of AONs in exon 
exclusion strategies, NHEJ-mediated DNA excision is applicable more generally for 
the targeted removal of specific genomic elements associated with disease. For 
example, chemokine receptor 5 (CCR5) is a critical human immunodeficiency 
virus type 1 (HIV-1) co-receptor that CCR5-tropic virions require to fuse with 
and infect cells (Broder and Collman 1997). Mutations in the CCR5 
gene can confer resistance to HIV-1 infection, and transplantation of hematopoietic 
stem cells carrying such a mutated gene has been aggressively pursued as a 
possible curative treatment (Allers et al. 2011). Using Cas9 and paired gRNAs, 
researchers have recently been able to selectively mutate the CCR5 gene and thereby 
render immune cells resistant to HIV-1 infection (Kang et al. 2015; Mandal 
et al. 2014). Moreover, CRISPR-Cas9 can be used to directly target and disrupt 
integrated proviral genomes (Vulin et al. 2012; Ebina et al. 2013; Kennedy and 
Cullen 2015; Wang et al. 2015). Other uses may include the removal of excess 
nucleotides in trinucleotide repeat disorders (Park et al. 2015) and the knock-out of 
proprotein convertase subtilisin/kexin type 9 (PCSK9), which is involved in 
hypercholesterolemia (Ding et al. 2014; Ran et al. 2015; Wang et al. 2016). Finally, approaches 
utilizing co-delivery of CRISPR components with a donor DNA template to correct 
mutations via activation of the HDR pathway are also currently under development to 
treat cystic fibrosis (Schwank et al. 2013), hemophilia A (Park et al. 2015), hereditary 
tyrosinemia (Yin et al. 2014), sickle cell disease (Orkin 2016), severe combined 
immunodeficiency (Booth et al. 2016), and other, predominantly loss-of-function 
genetic diseases. 
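
The paired-gRNA excision strategy described at the start of this paragraph can be pictured with a toy string model. The Python sketch below assumes Streptococcus pyogenes Cas9, which recognizes an NGG PAM immediately 3' of the protospacer and cuts about 3 bp upstream of that PAM. It searches the forward strand only and models an idealized, clean rejoining of the two ends, whereas real NHEJ repair frequently introduces small insertions or deletions at the junction. The sequences and function names are hypothetical.

```python
import re


def cut_site(seq, protospacer):
    """Index of the expected blunt cut on the forward strand, or None.

    Assumes the protospacer is followed directly by an NGG PAM and that
    the cut falls 3 bp upstream of the PAM.
    """
    match = re.search(re.escape(protospacer) + r"(?=[ACGT]GG)", seq)
    if match is None:
        return None
    return match.end() - 3


def excise(seq, guide_a, guide_b):
    """Remove the segment between the two cut sites (idealized NHEJ join)."""
    a, b = cut_site(seq, guide_a), cut_site(seq, guide_b)
    if a is None or b is None:
        raise ValueError("guide lacks a matching protospacer + NGG PAM")
    left, right = sorted((a, b))
    return seq[:left] + seq[right:]


# Hypothetical 20-nt protospacers and flanking sequence, for illustration only
guide_1 = "ACGTACGTACGTACGTACGT"
guide_2 = "GATTACAGATTACAGATTAC"
seq = "TTAACC" + guide_1 + "TGG" + "CCCCAAAATTTT" + guide_2 + "AGG" + "TTTT"

edited = excise(seq, guide_1, guide_2)
print(len(seq), len(edited))
print(edited)
```

Everything between the two cut sites is lost, including the PAM of the upstream guide, so the upstream target site is not regenerated in the joined product.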


Conclusions and Perspective 


Three independent studies have provided evidence for AAV-mediated delivery of 
CRISPR components targeting Dmd and restoring Dystrophin expression in 
dystrophic cardiac and skeletal muscle (Long et al. 2015; Nelson et al. 2015; 
Tabebordbar et al. 2015). One study (Tabebordbar et al. 2015) also showed Dmd 
gene targeting in dystrophic muscle stem cells. Correction of Dmd in dystrophic 
satellite cells provides a critical reservoir of myogenic progenitors capable of 
producing Dystrophin-expressing muscle fibers and represents a potential 
advantage compared to conventional transgene-mediated gene therapy. Transgenes 
delivered by AAV are generally maintained as non-replicating episomes and 
thus are diluted during expansion of satellite cells and their myoblast progeny. 
In contrast, CRISPR-mediated gene editing allows for irreversible modification of 
Dmd in satellite cells and their progeny, a result that is even more advantageous if 
the gene-corrected cells are selected for, or enriched, in dystrophic tissue. Expansion 
of clusters of naturally occurring Dystrophin-expressing revertant fibers in 
mdx muscle, which depends on muscle regeneration, suggests that such a selective 
advantage may exist for Dystrophin-expressing satellite cells in dystrophic 
muscle (Yokota et al. 2006). It would be interesting to test whether gene-corrected 
satellite cells are selectively enriched in dystrophic muscles after induced muscle 
degeneration and regeneration. Furthermore, it would be informative to examine 
whether permanent gene correction of dystrophic satellite cells (and their progeny) 
prevents the loss of Dystrophin-expressing nuclei in muscle fibers, which is 
typically seen with traditional gene therapy approaches (Vulin et al. 2012; Le Hir 
et al. 2013). Minimizing off-target activity of Cas9 nuclease, analyzing potential 
immune responses against CRISPR components and therapeutic gene products, 
and developing non-viral delivery approaches for transient expression of 
DMD-CRISPR in dystrophic muscle will also be important for moving gene editing 
technology towards clinical application for DMD. In addition, it is important to 
keep in mind that the efficacy and safety of this approach in non-rodent dystrophy 
models are yet to be studied. Canine models of DMD, including the golden 
retriever muscular dystrophy (GRMD) model, exhibit more severe dystrophic 
phenotypes that show greater similarity to human DMD phenotypes than the 
mdx mouse model (Kornegay et al. 2012). Therefore, preclinical studies in dog 
models might better indicate the therapeutic potential of in vivo gene editing for 
DMD. The recently developed human muscle xenograft model also provides a 
unique and informative opportunity for studying the efficacy of DMD-CRISPR in 
correcting mutations in human dystrophic muscle fibers and satellite cells in vivo 
(Zhang et al. 2014). Finally, to assess the likelihood of vertical transfer of 
gene-editing events to the next generation after systemic gene editing, both germline 
and transplacental transmission of AAV-CRISPR should be rigorously analyzed. 
AAV9 has been shown to cross the placenta in mice (Picconi et al. 2014), a 
finding that should be taken into consideration when planning clinical application of 
this technology. Still, the possibility of directly modifying the human genome to 
correct deleterious mutations that lead to devastating human diseases, such as 
DMD, presents unprecedented promise for the future of regenerative medicine. 
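
The dilution argument made earlier in this section for episomal AAV transgenes can be made concrete with a little arithmetic. The Python sketch below assumes, as a simplification, that episomes never replicate and are shared evenly between daughter cells, so the mean copy number per cell halves with every division. The starting copy number is a hypothetical value, not a measurement.

```python
def episome_copies_per_cell(initial_copies, divisions):
    """Mean episome copies per cell after a number of cell divisions.

    Assumes non-replicating episomes split evenly between daughter cells.
    """
    return initial_copies / (2 ** divisions)


initial_copies = 100  # hypothetical starting copy number per cell
for divisions in (0, 5, 10):
    mean_copies = episome_copies_per_cell(initial_copies, divisions)
    print(f"{divisions:2d} divisions: {mean_copies:.3f} copies per cell on average")

# A chromosomal edit, by contrast, is copied with the genome at each division,
# so every descendant of an edited cell still carries it.
```

Under these assumptions, ten divisions leave fewer than one episome per ten cells on average, which illustrates why editing the genome of proliferating satellite cells may be more durable than episomal transgene delivery.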